Reading as a god

Chapter 249 Outrageous Technology

Gait recognition is hard to put into practice, and the complexity of building the models is only part of the reason.

When you have not seen someone for a long time, especially a child, the changes can be so large that the human eye struggles to recognize them.

The same is true in machine vision. If the feature points used for recognition change too much, accuracy suffers. A person's facial features, for example, change markedly between childhood and adulthood, which makes it hard for face recognition to match the two.

The same holds true for gait recognition.

If only a few years have passed, the feature points have changed little and the person can still be identified effectively. If more than a decade has passed and the feature points have changed markedly, effective identification is no longer possible.

In practice, though, the technology is rarely used that way. People apply technology to production and daily life in order to work more efficiently and live better. Following the principle of maximizing benefit, they naturally push their tools toward their full potential: use the most suitable tool at the right time, in the right place, and in the right scenario, and use the part of it that works best, playing to its strengths and avoiding its weaknesses.

So, to get the most out of machine vision tools, people collect information dynamically according to actual needs and keep the data up to date, so that the technology delivers the best results and advances social production and daily life.

The gait recognition provided by the system, however, could recognize a person once and then, through its core computation, derive more or less every gait that person might have for the rest of their life.

Zhang Shan did not know how to describe such an outrageous trick.

The most outrageous part was that the system's gait recognition also came with posture recognition.

Gait recognition is a biometric identification technology based mainly on the way a person walks (the extracted feature points also cover hundreds of identifying elements such as body shape, muscle strength, and head shape). Gait is closely tied to identity. The technology can identify a target person by their gait, and can be applied to scenarios such as criminal investigation and suspect retrieval.

Posture recognition estimates body movements, finger movements, and other postures, which is very important for describing human posture and predicting human behavior. It is based mainly on observing key nodes of the human body, such as bones and joints. Posture recognition has nothing to do with identity. By studying human posture, it can be applied to scenarios such as fall detection, virtual fitting, and motion-sensing games.

All of these place high demands on deep learning.

Deep learning is a branch of machine learning. It is a class of algorithms that use artificial neural networks as the architecture for representation learning on data.

Deep learning is a family of machine learning algorithms based on learning representations of data. An observation (such as an image) can be represented in many ways, such as a vector of intensity values for each pixel, or more abstractly as a set of edges, regions of a particular shape, and so on. Some representations make it easier to learn a task from examples (for example, face recognition or facial expression recognition). The advantage of deep learning is that it replaces manual feature engineering with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction.
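As a rough illustration of how one observation can have several representations, the sketch below takes a tiny grayscale image and expresses it first as a flat vector of pixel intensities and then as a simple map of horizontal edges. The image values and the helper names `as_pixel_vector` and `as_edge_map` are made up for this example; real systems learn far richer representations.

```python
# Hypothetical 3x4 grayscale image; the values are made up for illustration.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]


def as_pixel_vector(img):
    """Flatten the image row by row into one vector of intensity values."""
    return [value for row in img for value in row]


def as_edge_map(img):
    """Absolute difference between horizontally adjacent pixels (a crude edge detector)."""
    return [[abs(row[i + 1] - row[i]) for i in range(len(row) - 1)] for row in img]


pixel_vector = as_pixel_vector(image)
edge_map = as_edge_map(image)
print(pixel_vector)
print(edge_map)
```

The pixel vector records every intensity, while the edge map is nonzero only where the dark region meets the bright one, which is often the more useful description for recognizing a shape.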

The goal of representation learning is to find better representations and to build better models that learn these representations from large-scale unlabeled data. Some of these representations are inspired by neuroscience and are loosely based on our understanding of information processing and communication patterns in nervous systems, such as neural coding, which attempts to define the relationship between a stimulus and the responses of neurons, and between the electrical activity of neurons in the brain.

So far there are several deep learning architectures, such as deep neural networks, convolutional neural networks, deep belief networks, and recurrent neural networks. They have been applied in computer vision, speech recognition, natural language processing, audio recognition, and bioinformatics, with excellent results.

In addition, "deep learning" has become something of a buzzword, or a rebranding of artificial neural networks.

Deep learning architectures, especially those based on artificial neural networks, can be traced back to the Neocognitron proposed by Kunihiko Fukushima in 1980, while artificial neural networks themselves have a longer history. In 1989, Yann LeCun and others applied the standard backpropagation algorithm, proposed in 1974, to a deep neural network used to recognize handwritten postal codes. Although the algorithm ran successfully, the computational cost was so high that training the network took 3 days, which made it impractical.

Many factors contributed to this slow training. One of them was the vanishing gradient problem, identified in 1991 by Sepp Hochreiter, a student of Jürgen Schmidhuber.

The earliest deep learning network for recognizing general natural objects in cluttered natural images was the Cresceptron, published by Juyang Weng et al. in 1991 and 1992.

It was also the first to propose a method that has since been widely used in many experiments and is now called max-pooling, which handles problems such as the deformation of large objects.
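To make the idea of max-pooling concrete, here is a minimal sketch in plain Python. It is not the Cresceptron's implementation, only the common textbook form: the feature map is cut into non-overlapping windows and each window is replaced by its largest value. The function name `max_pool_2d` and both example grids are assumptions made for illustration.

```python
def max_pool_2d(feature_map, size=2):
    """Non-overlapping max-pooling: keep the largest value in each size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            window = [feature_map[r + dr][c + dc]
                      for dr in range(size) for dc in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Two made-up feature maps: the second is the first with every response
# nudged to a different cell inside its 2x2 window (a small deformation).
original = [
    [1, 0, 0, 0],
    [0, 0, 0, 2],
    [0, 3, 0, 0],
    [0, 0, 4, 0],
]
deformed = [
    [0, 1, 0, 0],
    [0, 0, 2, 0],
    [3, 0, 0, 0],
    [0, 0, 0, 4],
]

print(max_pool_2d(original))
print(max_pool_2d(deformed))
```

Both inputs pool to the same 2x2 grid, which is why the operation gives some tolerance to small shifts and deformations.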

The Cresceptron not only learned general objects specified by a teacher directly from cluttered natural scenes, but also used backward analysis through the network to segment the recognized objects from the background of the image.

Around 2007, Geoffrey Hinton and Ruslan Salakhutdinov proposed an algorithm for training feed-forward neural networks efficiently. It treats each layer of the network as an unsupervised restricted Boltzmann machine and then fine-tunes the whole network with supervised backpropagation.

Earlier, in 1992, Schmidhuber had proposed a similar training method for recurrent neural networks in a more general setting, and showed experimentally that it can effectively speed up supervised learning.

Since its emergence, deep learning has become part of leading systems in many fields, especially computer vision and speech recognition. Experiments on common benchmark datasets, such as TIMIT for speech recognition and ImageNet and CIFAR-10 for image recognition, show that deep learning can improve recognition accuracy. Neural networks were also challenged by simpler classification models, and models such as support vector machines became popular machine learning algorithms in the 1990s and early 2000s.

Advances in hardware are another important reason for the renewed interest in deep learning. High-performance graphics processors have greatly sped up numerical and matrix operations, significantly reducing the running time of machine learning algorithms.

Because a large body of research in brain science has shown that the human brain network is not a cascade structure, deep learning networks have, since 2001, gradually been giving way to more promising networks based on brain models.

The basis of deep learning is distributed representation in machine learning. A distributed representation assumes that observations are generated by the interaction of different factors. Deep learning further assumes that this interaction can be divided into multiple levels, which correspond to multiple levels of abstraction of the observations. Different numbers of layers and layer sizes can be used for different degrees of abstraction.

Deep learning uses this idea of hierarchical abstraction: higher-level concepts are learned from lower-level ones. The hierarchy is often built layer by layer with a greedy algorithm, which selects the more effective features that help machine learning.

Many deep learning algorithms take the form of unsupervised learning, so they can be applied to unlabeled data that other algorithms cannot use. Such data is more abundant and easier to obtain than labeled data, which gives deep learning an important advantage.

Some of the most successful deep learning methods involve artificial neural networks. Artificial neural networks were inspired by a theory developed in 1959 by Nobel laureates David H. Hubel and Torsten Wiesel. Hubel and Wiesel found that the primary visual cortex of the brain contains two types of cells, simple cells and complex cells, which are responsible for different levels of visual perception. Inspired by this, many neural network models are also designed as hierarchies of different nodes.

Kunihiko Fukushima's Neocognitron introduced a convolutional neural network trained with unsupervised learning. Yann LeCun applied the supervised backpropagation algorithm to this architecture.

In fact, ever since the backpropagation algorithm was proposed in the 1970s, many researchers have tried to use it to train supervised deep neural networks, but most of the early attempts failed. In his diploma thesis, Sepp Hochreiter attributed the failure to vanishing gradients, a phenomenon that occurs both in deep feed-forward neural networks and in recurrent neural networks, whose training resembles that of deep networks. During layer-by-layer training, the error that should be used to correct the model parameters shrinks exponentially as the number of layers grows, which makes training inefficient.
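The exponential shrinkage can be seen in a simplified calculation. The sketch below assumes a chain of sigmoid units, one per layer, all sharing the same weight and the same pre-activation value. These are assumptions chosen to keep the arithmetic visible, and `backpropagated_factor` is a hypothetical helper, not part of any library. The derivative of the sigmoid is never larger than 0.25, so with a weight of 1 each extra layer multiplies the error signal by at most a quarter.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)


def backpropagated_factor(num_layers, pre_activation=0.0, weight=1.0):
    """Factor that scales the error after passing back through num_layers sigmoid units.

    Simplifying assumption: every layer has one unit with the same weight
    and the same pre-activation value.
    """
    factor = 1.0
    for _ in range(num_layers):
        factor *= weight * sigmoid_derivative(pre_activation)
    return factor


for depth in (1, 5, 10, 20):
    print(depth, backpropagated_factor(depth))
```

In this toy setting the error signal that reaches the first layer of a 20-layer chain is many orders of magnitude smaller than the one that reaches the last layer, so the early layers barely learn.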

To solve this problem, researchers have proposed several different methods. Jürgen Schmidhuber proposed a multi-level network in 1992 that uses unsupervised learning to train each layer of a deep neural network and then fine-tunes it with backpropagation. In this model, each layer of the network holds a compressed representation of the observed variables, which is passed on to the next layer.

Another approach is the long short-term memory (LSTM) neural network proposed by Sepp Hochreiter and Jürgen Schmidhuber.

In 2009, at the connected handwriting recognition competition held by ICDAR 2009, a deep multi-dimensional long short-term memory network won three of the contests without any prior knowledge.

Sven Behnke proposed the neural abstraction pyramid, a model that relies only on the sign of the gradient during training, to solve problems such as image reconstruction and face localization.

Other methods also use unsupervised pre-training to structure a neural network so that it discovers effective features, and then use supervised backpropagation to classify labeled data. The deep model proposed by Geoffrey Hinton et al. in 2006 learns high-level representations using multiple layers of latent variables. It uses the restricted Boltzmann machine proposed by Smolensky in 1986 to model each layer containing high-level features. The model guarantees that the lower bound on the log-likelihood of the data increases as layers are added. Once enough layers have been learned, the deep structure becomes a generative model that can reconstruct the entire dataset by top-down sampling. Hinton stated that this model can effectively extract features from high-dimensional structured data.

A Google Brain team led by Andrew Ng and Jeff Dean created a neural network that learned high-level concepts, such as cats, from YouTube videos alone.

Other methods rely on the massive computing power of modern computers, especially GPUs. In 2010, in Jürgen Schmidhuber's research group at the Swiss artificial intelligence laboratory IDSIA, Dan Ciresan and his colleagues showed that backpropagation can be run directly on GPUs despite the vanishing gradient problem. This method outperformed the other existing methods on the MNIST handwriting recognition dataset published by Yann LeCun et al.

As of 2011, the state of the art in deep learning with feed-forward neural networks was to alternate convolutional layers and max-pooling layers and put a pure classification layer on top. Training does not require unsupervised pre-training. Since 2011, GPU implementations of this approach have won several pattern recognition competitions, including the IJCNN 2011 traffic sign recognition competition.

These deep learning algorithms were also the first to match human performance on certain recognition tasks.
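The layout described here (convolution, max-pooling, convolution, max-pooling, then a classification layer) can be sketched on a one-dimensional signal in plain Python. This is only an illustration of the data flow: the kernels and classifier weights below are hand-picked rather than trained, the competition-winning networks worked on two-dimensional images and ran on GPUs, and all function names are assumptions.

```python
def conv1d(signal, kernel):
    """Slide the kernel over the signal (no padding) and apply ReLU to each response."""
    k = len(kernel)
    out = []
    for i in range(len(signal) - k + 1):
        total = sum(signal[i + j] * kernel[j] for j in range(k))
        out.append(max(0.0, total))
    return out


def max_pool1d(values, size=2):
    """Keep the largest value in each non-overlapping window of the given size."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def classify(features, weights, biases):
    """Linear classification layer: return the index of the highest score."""
    scores = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return scores.index(max(scores))


def forward(signal, kernels, weights, biases):
    """Alternate convolution and max-pooling, then classify the remaining features."""
    x = signal
    for kernel in kernels:
        x = max_pool1d(conv1d(x, kernel))
    return classify(x, weights, biases)


# Hand-picked (untrained) parameters: the first kernel responds to a rising edge,
# the second sums neighbouring responses. Class 1 means "contains a rising edge".
KERNELS = [[-1.0, 1.0], [1.0, 1.0]]
WEIGHTS = [[-1.0], [1.0]]
BIASES = [0.5, 0.0]

edge_signal = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
flat_signal = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

print(forward(edge_signal, KERNELS, WEIGHTS, BIASES))
print(forward(flat_signal, KERNELS, WEIGHTS, BIASES))
```

Each convolution and pooling pair shortens the signal while keeping the strongest responses, so the final classification layer only has to look at a handful of features.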

A deep neural network is a neural network with multiple hidden layers. Like shallow neural networks, deep neural networks can model complex nonlinear systems, but the extra layers give the model higher levels of abstraction and therefore more capacity. Deep neural networks are usually feed-forward networks, but research in language modeling and other areas has extended them to recurrent neural networks. Convolutional neural networks (CNNs) have been applied successfully in computer vision. Convolutional neural networks have since also been used as acoustic models in automatic speech recognition, with better results than earlier methods.
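A small example shows what a hidden layer adds. The sketch below builds a feed-forward network with a single hidden layer of two ReLU units and hand-picked weights (assumptions for illustration, not learned values) that computes XOR, a nonlinear function that no purely linear layer can represent. Deeper networks stack more layers of the same kind.

```python
def relu(x):
    return max(0.0, x)


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: weighted sum plus bias, then an optional activation."""
    outputs = []
    for row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


# Hand-picked weights, not the result of training.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[1.0, -2.0]]
OUTPUT_B = [0.0]


def xor_net(a, b):
    """Forward pass: input -> hidden layer (ReLU) -> linear output."""
    hidden = dense([a, b], HIDDEN_W, HIDDEN_B, relu)
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]


for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, xor_net(a, b))
```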

Like other neural network models, deep neural networks can run into many problems if trained naively. Two common ones are overfitting and excessive computation time.

Deep neural networks are prone to overfitting because the added layers of abstraction let the model capture rare dependencies in the training data. Methods such as weight decay or sparsity can be applied during training to reduce overfitting.
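Weight decay can be sketched as a small change to the gradient step. The version below assumes plain gradient descent with an L2 penalty of `(weight_decay / 2) * sum(w ** 2)` added to the loss. The function name `sgd_step` and the numbers are made up for illustration, and other optimizers apply the decay differently.

```python
def sgd_step(weights, grads, learning_rate=0.1, weight_decay=0.0):
    """One gradient descent step on loss + (weight_decay / 2) * sum(w ** 2).

    The penalty adds weight_decay * w to each gradient, which pulls every
    weight toward zero in proportion to its size.
    """
    return [w - learning_rate * (g + weight_decay * w)
            for w, g in zip(weights, grads)]


weights = [1.0, -2.0]
zero_grads = [0.0, 0.0]  # pretend the data loss is already flat here

print(sgd_step(weights, zero_grads))                    # no decay: weights unchanged
print(sgd_step(weights, zero_grads, weight_decay=0.5))  # decay: weights shrink toward 0
```

Even when the data gives no reason to change a weight, the penalty keeps shrinking it, so large weights survive only if the training data keeps supporting them.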

Another regularization method adopted later in deep neural network training is "dropout", which randomly drops a portion of the hidden units during training so that the model avoids fitting rare dependencies.
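A minimal sketch of dropout on one layer's activations follows. It uses the common "inverted dropout" variant, which is an assumption on top of the description above: surviving units are scaled up during training so that nothing needs to change at inference time. The function name and parameters are hypothetical.

```python
import random


def dropout(activations, drop_prob, training=True, rng=random):
    """Inverted dropout: zero each unit with probability drop_prob during training
    and rescale the survivors by 1 / (1 - drop_prob)."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


hidden = [0.2, 1.5, 0.7, 0.9]
print(dropout(hidden, drop_prob=0.5))                  # random subset zeroed, rest doubled
print(dropout(hidden, drop_prob=0.5, training=False))  # unchanged at inference time
```

Because a different random subset of units is silenced on every training step, no single unit can rely on a rare pattern that only a few other units carry.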

(End of this chapter)
