Many traditional signal processing and machine learning techniques rely on shallow architectures, which consist of a single layer of non-linear feature transformation. Examples of shallow models include linear and nonlinear dynamic models, conditional random fields, maximum entropy models, hidden Markov models, kernel regression, and multilayer perceptrons with only one hidden layer. A property common to these shallow models is their simple architecture: a single layer is responsible for transforming the raw input signals into a problem-specific feature space, which may not be directly observable. The deep learning paradigm tackles problems on which shallow architectures (e.g. support vector machines, SVMs) are affected by the curse of dimensionality. As part of a two-stage learning method involving many layers of nonlinear processing, a set of statistically robust features is extracted automatically from the data. Deep learning methods can be used in remote sensing applications such as land cover classification, vehicle detection in satellite images, and hyperspectral image classification.
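The structural difference between the two families can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from the text: the helper names (`dense`, `init_layer`, `forward`), the tanh non-linearity, the layer sizes, and the random, untrained weights are all assumptions chosen to show that a shallow model applies one non-linear transformation to the input, while a deep model composes several.

```python
import math
import random


def dense(inputs, weights, biases):
    """One layer: an affine map followed by a tanh non-linearity."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Hypothetical initializer: random weights, zero biases (no training)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, layers):
    """Pass the input through each layer in turn."""
    for weights, biases in layers:
        x = dense(x, weights, biases)
    return x


rng = random.Random(0)              # fixed seed so the sketch is repeatable
signal = [0.2, -0.5, 0.9, 0.1]      # made-up 4-dimensional input signal

# Shallow: a single non-linear transformation into a 3-dimensional feature space.
shallow = [init_layer(4, 3, rng)]

# Deep: three stacked non-linear transformations ending in the same feature size.
deep = [init_layer(4, 8, rng), init_layer(8, 8, rng), init_layer(8, 3, rng)]

shallow_features = forward(signal, shallow)
deep_features = forward(signal, deep)
```

Both calls return a 3-dimensional feature vector; the only difference is how many non-linear stages the input passes through on the way. In practice the weights would be learned from data rather than drawn at random.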