2019/10/14 19:50:03

Machine learning (Machine Learning)

A highly specialized field of knowledge that is one of the main sources of the technologies and methods applied in Big Data and the Internet of Things. It studies and develops algorithms for the automated extraction of knowledge from raw data sets, the training of software systems on the data obtained, the generation of forecasts and/or recommendations, image recognition, and so on.


Machine learning is a class of artificial intelligence methods whose defining feature is not the direct solution of a task but learning in the course of solving many similar tasks. Such methods are built using mathematical statistics, numerical methods, optimization methods, probability theory, graph theory, and various techniques for working with data in digital form. According to HeadHunter (2018 data), machine learning specialists earn 130-300 thousand rubles, and large companies compete fiercely for them.


Almost threefold growth in the number of AI/ML projects in Russia over 2 years and 9 months

On October 14, 2019, Jet Infosystems reported that it had analyzed more than 360 AI/ML projects implemented in Russia from the beginning of 2017 through September 2019. The research showed growth of more than three times.

Jet Infosystems analysts note that 2018 saw explosive growth in the popularity of machine learning (ML) projects: for every project in 2017 there were 2.7 projects in 2018.

In 2019 the number of projects continued to grow relative to 2018 (by approximately 10%), but their structure changed radically. Whereas in 2017 these were isolated projects by IT companies, by 2019 artificial intelligence had become a fully working technology applied in many industries. In addition, there were far fewer test (pilot) projects than in 2018.

As for industry application, the lead still belongs to banking (20%) and retail (20%), where there is plenty of data, strong competition, and a budget for implementation. In 2019 AI technologies also reached industrial production: this sector accounts for every 14th project.

The top five also includes aggregator companies (for example, Yandex), which offer mail, translation, transport, and other services, as well as advertising and travel agencies.

According to the research, it is not only large businesses that implement AI: the number of projects in small enterprises has been growing for the third year in a row. Internet services, online stores, medium-sized industrial producers, small regional transport companies, regional divisions of federal state institutions, and others are actively adopting digital technologies.

Software 2.0: How a new approach to software development will make computers smarter

The Software 2.0 paradigm is an approach to software development that is capable of producing a qualitative breakthrough in computing. The purpose of Software 2.0 is to create a model that can generate code itself: it learns what code should be produced, in accordance with the given rules, to obtain particular results. Read more here.

IBM launched a portal with free data sets for machine learning in companies

On July 16, 2019, IBM launched a portal with free data sets for machine learning in companies. The company calls IBM Data Asset eXchange (DAX) a unique project for corporate clients, despite the large number of open data sets already available on the Internet (for example, on GitHub). Read more here.

10 best programming languages for machine learning — GitHub

In January 2019, GitHub, a service for hosting IT projects and developing them collaboratively, published a ranking of the most popular programming languages used for machine learning (ML). The list is based on the number of repositories whose authors state that their applications use ML algorithms. Read more here.

2018: Problems of machine learning, according to IBM

On February 27, 2018, Rob High, chief technology officer of IBM Watson, said that the main challenge in machine learning today is to reduce the amount of data required to train neural networks. High believes there is every reason to consider this problem solvable. His colleagues share this opinion: John Giannandrea, head of artificial intelligence (AI) development at Google, noted that his company is also working on this problem.

As a rule, machine learning models work with huge data sets to guarantee the accuracy of a neural network, but in many industries large databases simply do not exist.

IBM spoke about the problems of machine learning

High, however, considers the problem solvable, since the human brain has learned to cope with it. When a person faces a new task, they draw on experience accumulated in similar situations. High also suggests using contextual thinking. Transfer learning can help as well: it makes it possible to take an already trained AI model and use it to train another neural network for which far less data is available.
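The idea of transfer learning mentioned above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two toy tasks, the eight hand-picked target examples, and the helper names (`fit_logistic`, `frozen_feature`, `predict_target`). A small logistic model stands in for the "already trained AI model"; it is not the approach used by IBM or Google.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(rows, labels, n_features, lr, epochs):
    """Batch gradient descent for logistic regression; returns (weights, bias)."""
    w, b = [0.0] * n_features, 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# 1) Source task with plenty of data (hypothetical): is x0 + x1 > 0?
rng = random.Random(0)
source_x = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]
source_y = [1 if x0 + x1 > 0 else 0 for x0, x1 in source_x]
source_w, _ = fit_logistic(source_x, source_y, 2, lr=0.5, epochs=200)
norm = math.sqrt(sum(wi * wi for wi in source_w))

def frozen_feature(x):
    # Reused, frozen part of the trained model: the direction learned on the source task.
    return [sum(wi * xi for wi, xi in zip(source_w, x)) / norm]

# 2) Related target task with only eight labelled examples: is x0 + x1 > 0.5?
target_x = [(0.9, 0.3), (0.5, 0.4), (0.2, 0.6), (0.6, 0.2),
            (0.1, 0.1), (-0.5, 0.2), (0.3, 0.0), (0.4, -0.6)]
target_y = [1, 1, 1, 1, 0, 0, 0, 0]

# Only the small "head" (one weight and one bias) is trained on the target data.
head_w, head_b = fit_logistic([frozen_feature(x) for x in target_x],
                              target_y, 1, lr=1.0, epochs=2000)

def predict_target(x):
    return 1 if sigmoid(head_w[0] * frozen_feature(x)[0] + head_b) > 0.5 else 0

# Check on fresh target-task points that were never used for training.
test_x = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(1000)]
accuracy = sum(predict_target(x) == (1 if x[0] + x[1] > 0.5 else 0)
               for x in test_x) / len(test_x)
print(f"target-task accuracy with 8 labelled examples: {accuracy:.2f}")
```

The design choice to note is that the second training run adjusts only two numbers, because the part learned on the data-rich task is kept frozen; that is why a handful of examples can be enough.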

However, the problems of machine learning are not limited to this, especially when it comes to natural speech.

"We are trying to understand how to teach AI to interact with people without arousing mistrust, and how to influence their thinking," High explained. "When communicating, people perceive not only information but also gestures, facial expressions, intonation, and voice modulation."

High notes that AI does not have to reproduce these aspects in an anthropomorphic form, but some response signals, for example visual ones, should be given. At the same time, most AI systems first need to understand the essence of a question and learn to navigate context, especially how a question relates to the previous ones.

This points to the next problem. Many of the machine learning models in use today are inherently biased, because the data on which they were trained is limited. High distinguishes two aspects of such bias.

"First, data really can be collected incorrectly, and those who select data for machine learning systems should take more care that it reflects the interests of all occupational and demographic groups," High commented. "On the other hand, sometimes data is deliberately selected to reflect only a certain aspect of a problem or a certain sample, because that is how the task is defined."

As an example, High cited the joint project of IBM and the Sloan Kettering cancer center. They prepared an AI algorithm based on the work of the best cancer surgeons.

However, the doctors of the Sloan Kettering cancer center follow a particular approach to cancer treatment. It is their school, their brand, and this philosophy should be reflected in the AI created for them and preserved in all its subsequent generations that spread beyond this cancer center. Most of the effort in creating such systems goes into ensuring that the data is selected correctly. The sample of people and their data should reflect the larger cultural group to which they belong.

High also noted that IBM representatives have finally begun to discuss these problems with clients regularly. According to High, this is a step in the right direction, especially given that many of his colleagues prefer to ignore the issue.

Giannandrea shares the concerns about AI bias. Last fall he said that what he fears is not an uprising of intelligent robots but bias in artificial intelligence. The problem becomes more significant the further the technology penetrates into areas such as medicine or law, and the more people without a technical background begin to use it.[1]


3% of the companies use machine learning — ServiceNow

In October 2017, ServiceNow, a provider of cloud solutions for business process automation, published the results of a study on the adoption of machine learning technologies in companies. Together with the research center Oxford Economics, it surveyed 500 chief information officers in 11 countries.

It turned out that by October 2017, 89% of the companies whose staff answered the analysts' questions were using machine learning mechanisms to some degree.

Thus, 40% of organizations and enterprises are exploring the opportunities and planning stages of implementing such technologies, 26% are running pilot projects, 20% apply machine learning in particular areas of the business, and 3% use it across all of their activities.

According to 53% of chief information officers, machine learning is a key priority area, and companies are looking for suitable specialists to develop it.

As of October 2017, the penetration of machine learning was highest in North America: 72% of companies were at some stage of studying, testing, or using the technology. In Asia this figure was 61%, in Europe 58%.

About 90% of chief information officers say that automation increases the accuracy and speed of decision making. According to more than half (52%) of survey participants, machine learning helps automate not only routine tasks (for example, displaying cyberthreat warnings) but also more complex workloads, such as ways of responding to hacker attacks.

The chart above shows the extent of automation in different areas of companies in 2017, with a forecast for 2020. For example, in 2017 about 24% of operations in information security were fully or largely automated, and in 2020 the figure may grow to 70%.

The most promising technology. What is causing the general craze for machine learning?

According to analysts, machine learning is the most promising technology trend of today. How did this technology arise, and why has it become so sought after? What principles is machine learning based on? What prospects does it offer for business? These questions are answered in material prepared for TAdviser by the journalist Leonid Chernyak.

[1] [2] The keen interest in machine learning (Machine Learning, ML) and the numerous attempts to apply ML in the most diverse, sometimes unexpected, areas of human activity are a sign of the coming era of cognitive computing (see the separate article for more detail).

Evidence of this is Gartner's Hype Cycle dated August 2016, on which ML sits at the peak of expectations. The analyst firm's report emphasizes that the current surge of interest in artificial intelligence (AI) in general and ML in particular should be distinguished from the unmet expectations of past decades, which led to the temporary neglect of AI.

What is happening in 2016-2017 is more prosaic and pragmatic, free of romantic promises about anthropomorphic technologies that imitate the human brain. There is no talk of thinking machines, let alone of threats from robots. The Gartner report quotes a statement by John Kelly, IBM's vice president for research, that is "cynical" and plainly unacceptable to supporters of strong AI:

The success of cognitive computing will not be measured by the Turing test or by any other ability of a computer to imitate the human brain. It will be measured by such practical indicators as return on investment, new market opportunities, the number of people cured, and the number of human lives saved.

Gartner's Hype Cycle, August 2016

However great the interest in ML, it is incorrect to equate all of cognitive computing (Cognitive Computing, CC) with ML alone. CC is in fact a component of AI, a complete ecosystem of which ML is a part. Besides ML, CC includes much else: automated decision making, recognition of audio and video data, machine vision, and natural language text processing.

However, it is difficult to draw strict boundaries between the individual areas of CC. Some of them overlap, but what is certain is that ML includes the mathematical algorithms that support the process of cognitive learning.

Artificial intelligence (AI), cognitive computing (CC) and machine learning (ML)

ML is the training of systems that have elements of weak AI. Strong AI is the name given to artificial general intelligence (Artificial General Intelligence), which could in theory be embodied by some hypothetical machine with thinking abilities comparable to those of a human.

Strong AI is credited with such traits as:

  • the ability to feel (sentience),
  • the ability to make judgments (sapience),
  • introspection (self-awareness), and even
  • consciousness.

Weak AI, by contrast, is AI that has no mind or mental abilities (non-sentient computer intelligence) and is focused on solving applied tasks.

Although it is part of weak AI, ML nevertheless has features in common with human learning as identified by psychologists at the beginning of the 20th century. Several theoretically possible approaches to learning as a process of knowledge transfer were identified then, and one of them, called cognitive learning, corresponds directly to ML.

The learner, in our case the AI, is shown particular images in a form accessible to it. To absorb the knowledge being imparted, the learner only needs the corresponding abilities and incentives. The foundations of the theory of cognitive learning were developed by the Swiss psychologist Jean Piaget (1896-1980). He, in turn, drew on work in Gestalt psychology developed by the German, later American, psychologist Wolfgang Köhler (1887-1967).

The theory of cognitive learning rests on the assumption that a person is able to learn, has the necessary incentives, and can structure and store accumulated information. The same applies to ML. It can be considered a variety of cognitive learning adapted for the computer.

Jean Piaget

The history of ML, like much else in artificial intelligence, began with promising work in the 1950s and 1960s. A long period of knowledge accumulation followed, known as the "AI winter". In recent years there has been an explosion of interest, mainly in one area: deep learning.

Arthur Samuel, Joseph Weizenbaum, and Frank Rosenblatt were pioneers of ML. The first became widely known for creating, in 1952, the self-learning Checkers-playing program, which, as the name suggests, could play checkers. Perhaps more significant for posterity was his participation, together with Donald Knuth, in the TeX project, which produced a computer typesetting system that has had no equal for preparing mathematical texts for nearly 40 years. The second wrote ELIZA in 1966, a virtual conversation partner able to imitate (or rather, parody) a dialog with a psychotherapist. The program obviously owes its name to the heroine of Bernard Shaw's play. Rosenblatt went furthest of all: in the late 1950s at Cornell University he built the Mark I Perceptron, which can tentatively be considered the first neurocomputer.

The basic scientific principles of ML emerged in the 1960s and 1970s. In its modern form, ML brings together previously independent areas:

  • neural networks,
  • case-based learning,
  • genetic algorithms,
  • rule induction, and
  • analytic learning.

It was shown that the practical transfer of knowledge to a machine being trained (a neural network) can be built on the computational theory of learning from examples, which has been developing since the 1960s.

Informally, ML can be described as follows. One takes descriptions of individual examples (precedents), which together are called the training sample. From this set of individual data fragments it is then possible to identify general properties (dependencies, patterns, relationships) that are inherent not only in the specific sample used for training but in all examples in general, including those not yet observed. Learning algorithms and the fitting of a model to the data sample make it possible to find an optimal set of model parameters and then to use the trained model to solve particular applied tasks.
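As a rough illustration of this process, the sketch below fits a model to a small training sample and then applies it to a value that was not in the sample. The data are hypothetical (noisy observations of y = 2x + 1), and ordinary least squares is used here only as one simple example of a fitting procedure.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical training sample: noisy observations of y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]

# Fitting: find the model parameters that best match the training sample.
slope, intercept = fit_line(train_x, train_y)

def predict(x):
    # Use of the trained model on inputs that may not have been observed.
    return slope * x + intercept

print(f"learned model: y = {slope:.2f} * x + {intercept:.2f}")
print(f"prediction for unseen x = 10: {predict(10.0):.2f}")
```

The learned parameters come out close to the underlying dependence, so the prediction for x = 10 lands near 21 even though no such example was in the training sample.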

In general, ML can be expressed by a formula:

Learning = Representation + Evaluation + Optimization


  • Representation: expressing the classified element in a formal language that the machine can interpret
  • Evaluation: a function that distinguishes good classifiers from bad ones
  • Optimization: the search for the best classifiers
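The three components can be made concrete with a deliberately tiny sketch. The classifier family (a single threshold on one number), the toy data, and the function names are assumptions chosen for illustration; real systems use far richer representations and optimizers.

```python
# Representation: the family of classifiers "predict 1 if x > threshold".
def classify(x, threshold):
    return 1 if x > threshold else 0

# Evaluation (assessment): the number of mistakes a classifier makes on the training sample.
def count_errors(threshold, xs, ys):
    return sum(classify(x, threshold) != y for x, y in zip(xs, ys))

# Optimization: search the candidate thresholds for the one with the fewest errors.
def best_threshold(xs, ys):
    points = sorted(xs)
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return min(candidates, key=lambda t: count_errors(t, xs, ys))

# Hypothetical training sample.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]

threshold = best_threshold(xs, ys)
print(threshold)                 # 4.5
print(classify(5.0, threshold))  # 1
print(classify(4.0, threshold))  # 0
```

Swapping any one of the three parts (a different classifier family, a different error measure, or a different search method) gives a different learning algorithm, which is the point of the formula.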

The main purpose of ML is to give, for example, a neural network the ability to recognize something new that was not in the set used for training but has the same properties.

Learning covers image recognition, regression analysis, and forecasting. The most common approach is to build a model of the dependence being recovered in the form of a parametric family of algorithms. Its essence is the numerical optimization of the model parameters so as to minimize the number of errors on a given training sample of examples.

Training consists of fitting the constructed model to the sample. But this approach has an inherent weakness: as the model becomes more complex, the algorithms optimizing it begin to capture not only the features of the dependence being recovered but also the measurement errors in the training sample and the error of the model itself. As a result, the quality of the algorithm deteriorates.
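This weakness (usually called overfitting) can be shown on a toy example. The sketch assumes a hypothetical true rule "label 1 if x > 5.5", a training sample with two deliberately corrupted labels standing in for measurement errors, and two models: a one-parameter threshold and a nearest-neighbour "memorizer" that plays the role of the overly complex model.

```python
def fit_threshold(xs, ys):
    """Simple model with one parameter: predict 1 if x > t."""
    points = sorted(xs)
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]

    def errors(t):
        return sum((1 if x > t else 0) != y for x, y in zip(xs, ys))

    t = min(candidates, key=errors)
    return lambda x: 1 if x > t else 0

def fit_memorizer(xs, ys):
    """Flexible model: 1-nearest-neighbour, which reproduces every training label."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def error_rate(model, xs, ys):
    return sum(model(x) != y for x, y in zip(xs, ys)) / len(xs)

def true_label(x):
    # Hypothetical dependence that the models are supposed to recover.
    return 1 if x > 5.5 else 0

train_x = [float(i) for i in range(11)]
train_y = [true_label(x) for x in train_x]
train_y[2] = 1  # simulated measurement error
train_y[8] = 0  # simulated measurement error

# Fresh, error-free points that were not in the training sample.
test_x = [i + 0.25 for i in range(10)]
test_y = [true_label(x) for x in test_x]

simple = fit_threshold(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

for name, model in (("threshold", simple), ("memorizer", memorizer)):
    print(name,
          "train error:", round(error_rate(model, train_x, train_y), 2),
          "test error:", round(error_rate(model, test_x, test_y), 2))
```

The memorizer makes no mistakes on the training sample precisely because it has also learned the two corrupted labels, and it repeats them on new data; the simpler model cannot fit the errors and generalizes better here.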

A way out of this situation was proposed by V.N. Vapnik and A.Ya. Chervonenkis in their theory of dependence recovery, which gained worldwide recognition in the 1980s and became one of the most important branches of computational learning theory.

The transition from ML theory to practice, which took place in the 21st century, was driven by work on deep neural networks (Deep Neural Network, DNN). The term deep learning itself is believed to have been proposed in 1986 by Rina Dechter, although the true story of its emergence is probably more complicated.

By the mid-2000s a critical mass of knowledge about DNNs had accumulated and, as always in such cases, someone broke away from the peloton and took the leader's jersey. So it has been, and probably always will be, in science. In this case the leader was Geoffrey Hinton, a British scientist who continued his career in Canada. From 2006 he and his colleagues began to publish numerous articles on DNNs, including in the scientific journal Nature, which earned him lifetime fame as a classic. A strong, close-knit community formed around him, which for several years worked, as people now say, "in stealth mode". Its members call themselves the "Deep Learning Conspiracy" or even the "Canadian mafia".

A leading trio emerged: Yann LeCun, Yoshua Bengio, and Geoffrey Hinton, also known as LBH (LeCun & Bengio & Hinton). LBH's emergence from the underground was well prepared and was supported by Google, Facebook, and Microsoft. Andrew Ng, who worked at MIT and Berkeley and now heads artificial intelligence research at Baidu's laboratory, cooperated actively with LBH. He linked deep learning with graphics processors.

Geoffrey Hinton

The current success of ML and its universal recognition became possible thanks to three circumstances:

1. The amount of data is growing exponentially. This creates the need for data analysis and is a necessary condition for deploying ML systems. At the same time, this amount of data makes training possible, since it yields a large number of samples (examples), and that is the sufficient condition.

2. The necessary processor base has been created. Solving ML tasks is known to fall into two phases. In the first, an artificial neural network is trained (training). During this stage a large number of samples must be processed in parallel. At the moment there is no alternative to graphics processors (GPUs) for this purpose, and in most cases Nvidia GPUs are used. Ordinary high-performance CPUs can be used to run the trained neural network. This division of functions between processor types may soon change significantly. First, Intel promises to bring to market in 2017 the specialized Nervana processor, which is expected to be more productive than a GPU. Second, new types of programmable arrays (FPGAs) and application-specific integrated circuits (ASICs) are appearing, along with Google's specialized Tensor Processing Unit (TPU).

3. Libraries for ML software have been created. As of 2017 there are more than 50 of them. Here are just a few of the best known: TensorFlow, Theano, Keras, Lasagne, Caffe, DSSTNE, Wolfram Mathematica. The list could be continued. Practically all of them support the OpenMP application interface, the Python, Java, and C++ languages, and the CUDA platform.

The future scope of ML is, without exaggeration, vast. In the context of the Fourth Industrial Revolution, the most significant role of ML is to expand the potential of Business Intelligence (BI), a name loosely translated as "business analytics".

In addition to the traditional, largely quantitative BI question "What is happening in the business?", with the help of ML it will be possible to answer others as well: "Why are we doing this?", "How can we do it better?", "What should we do?", and similar qualitative, substantive questions.

Machine learning in simple examples

What is machine learning?

It is a method of programming in which the machine itself creates an algorithm on the basis of a model specified by a person and the data loaded into it.

This approach differs from classical programming: during "training", the program is shown many examples and taught to find patterns in them. People learn in a similar way: instead of describing a dog in words, you simply show a child a dog and say what it is. If such a program is shown, for example, a million photos of skin tumors, it learns to diagnose cancer[2] from a picture better than a living specialist[3].

Why is training models so difficult?

Imagine that I am training a machine using a group of people... and the golden rule here is that they should be equally interested in and familiar with the process, so that, say, I cannot take five programmers and four recent students... You should try to select people either completely at random or by shared interests. There are two ways to do this. You show them a great many pictures. You show them images of mountains alternating with photos of camels, as well as images of objects that look almost exactly like mountains, for example ice cream in a waffle cone. And you ask them to say which of these objects can be called a mountain. Meanwhile the machine watches the people and, based on how they handle the images of mountains, it too begins to pick mountains out of the set of images. This approach is called heuristic, writes the author of PCWeek[4].

We look at people, model their behavior by observation, and then try to repeat what they do. It is a type of learning. Such heuristic modeling is one of the methods of machine learning, but it is not the only one.

But there are many simple tricks with which this system can be deceived. A fine example is the recognition of human faces. Look at the faces of different people. Probably everyone knows that there are technologies for modeling based on certain points on the face, say, the corners of the eyes. I do not want to go into trade secrets, but there are some areas between which angles can be constructed, and these angles usually change little over time. But then you are shown pictures of people with wide-open eyes or twisted mouths. Such people are trying to confuse these algorithms by distorting their facial features. That is why you cannot smile in a passport photo. But machine learning has already moved far ahead. We have such tools as Eigenface and other technologies for modeling the rotation and distortion of faces, which make it possible to determine that this is the same person.

Over time these tools get better. And sometimes, when people try to confuse the learning process, we learn from their behavior too. So the process is self-developing, and in this respect progress is constant. Sooner or later the goal will be reached, and yes, the machine will find only mountains. It will not miss a single mountain and will never be confused by a cup of ice cream.

How does it differ from classical programming?

Initially this process took the form of a game or consisted of identifying images. Researchers at the time asked participants to play games or to help with training through simple statements such as "This is a mountain", "This is not a mountain", "This is Mount Fuji", "This is Mount Kilimanjaro". In this way they accumulated a set of words. They had a group of people using words to describe images (for example, in the Amazon Mechanical Turk project).

Using these techniques, they effectively compiled a set of words and said: "So, the word 'mountain' is often associated with such and such, and there is a high statistical correlation between the word 'mountain' and this image. So if people search for information on mountains, show them this image. If they search for Mount Fuji, show them this image and not that one." This was the technique of combining the human brain with descriptive words. As of 2017 it is not the only technique: there are now many more sophisticated ones.
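The word-to-image association described above can be sketched with simple counting. The annotations, image identifiers, and function names below are hypothetical, and the share of an image's annotations that use a given word serves as a simple stand-in for the "statistical correlation" the text mentions.

```python
from collections import Counter, defaultdict

# Hypothetical annotations collected from people: (image id, word used to describe it).
annotations = [
    ("img_a", "mountain"), ("img_a", "mountain"), ("img_a", "Fuji"),
    ("img_b", "mountain"), ("img_b", "Kilimanjaro"), ("img_b", "Kilimanjaro"),
    ("img_c", "ice cream"), ("img_c", "ice cream"), ("img_c", "ice cream"),
    ("img_c", "mountain"),
]

# Count how often each word was used for each image.
words_per_image = defaultdict(Counter)
for image, word in annotations:
    words_per_image[image][word] += 1

def association(word, image):
    """Share of an image's annotations that use the given word."""
    counts = words_per_image[image]
    return counts[word] / sum(counts.values())

def best_image(word):
    """Image most strongly associated with the query word."""
    return max(words_per_image, key=lambda image: association(word, image))

print(best_image("mountain"))     # img_a
print(best_image("Kilimanjaro"))  # img_b
print(best_image("ice cream"))    # img_c
```

Note that the ice cream picture was once labelled "mountain" by an annotator, yet it is not returned for that query, because most of its annotations say otherwise.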

Can I apply machine learning in my business?

Machine learning is of great practical importance for many industries, from the public sector, transport, and medicine to marketing, sales, finance, and insurance. There are a huge number of ways to apply it: for example, predictive maintenance, supply chain optimization, fraud detection, personalized health care, reducing road traffic, efficient flight scheduling, and many others.

Public institutions use machine learning for data mining in order to increase efficiency and save money. Banks apply machine learning to identify investment opportunities, high-risk clients, or signs of cyberthreats. In health care, machine learning helps use data from wearable devices and sensors to assess a patient's health in real time.

Machine learning algorithms

  • Linear and logistic regression
  • SVM
  • Decision trees
  • Random forest
  • AdaBoost
  • Gradient boosting
  • Neural networks
  • K-means
  • Expectation maximization algorithm
  • Autoregressions
  • Self-organizing maps
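To give a feel for one item from this list, here is a minimal sketch of K-means on one-dimensional data. The points, the starting centroids, and the fixed number of iterations are assumptions for illustration; library implementations add smarter initialization and stopping rules.

```python
def k_means_1d(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and moving centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster (if not empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical data with two visible groups, around 1 and around 5.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids, clusters = k_means_1d(points, centroids=[0.0, 6.0])

print([round(c, 2) for c in centroids])
print(clusters)
```

No labels are supplied here: the grouping is found from the data alone, which is what distinguishes clustering methods such as K-means from the supervised algorithms earlier in the list.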

Harmful machine learning

Prospects for the development of the mathematical apparatus of AI, or is there life beyond ML/DL

Main article: Prospects for the development of the mathematical apparatus of AI, or is there life beyond ML/DL