| Developers: | |
| Last Release Date: | October 2018 |
| Branches: | Internet services |
CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a fully automated public Turing test for distinguishing computers from people. The main idea is to offer the user a task that a person solves easily but that is extremely difficult and laborious for a computer. CAPTCHA was developed at Carnegie Mellon University, and the work later continued in a project called reCAPTCHA. In 2009, reCAPTCHA was acquired by Google.
2018
The fastest and most accurate algorithm for cracking CAPTCHA has been created
At the end of December 2018, researchers reported developing what they described as the fastest and most accurate machine learning algorithm to date for breaking text CAPTCHA systems.
The new algorithm, developed by scientists from Lancaster University (UK), Northwest University (China) and Peking University (China), is based on a generative adversarial network (GAN). This is a class of AI algorithms used when a large amount of training data is not available. Classification algorithms typically require huge data sets for training, whereas a GAN takes a so-called "generative" approach, creating new data similar to the samples that are available. This generated data is then fed to a conventional machine learning algorithm.
The scientists applied the concept to breaking text CAPTCHAs, which the vast majority of previous studies had evaluated only against classical machine learning algorithms. In a real-world scenario, the researchers argue, an attacker cannot collect millions of CAPTCHAs from a live website or API without being detected. For their study, they therefore used only 500 text CAPTCHAs from each service.
The resulting algorithm recognized the text quickly and accurately, and the approach turned out to be more efficient and cheaper than other available systems. According to the researchers, the algorithm can solve a text CAPTCHA within 0.05 seconds on an ordinary desktop PC. That means attackers would not need to pay for expensive cloud computing servers to break text CAPTCHAs in real time on websites.
The researchers recommend that website owners implement alternative bot detection measures that use multiple layers of security, such as comparison of usage patterns, device location or biometric data.[1]
reCAPTCHA v3 release
At the end of October 2018, Google introduced a new version of reCAPTCHA, its bot protection system for websites. The notable feature of the technology is that it requires no special action from users, such as typing text or selecting pictures, to confirm that they are human and not a robot.
| "Over the past ten years, reCAPTCHA has been constantly improving its technology. In short, reCAPTCHA helps protect your sites without inconvenience to users and gives you more opportunities to decide what to do in risk situations," said Wei Liu, Google Product Management Manager. |
reCAPTCHA analyzes the actions of site visitors, including mouse movement and page interactions, and distinguishes robots from real people.
The site administrator can set a threshold at which the system requires the user to pass additional verification. To do this, administrators add action labels to pages or page sections. One option is to require telephone verification in response to an attempt to leave a comment, access a user profile or view transaction history.
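A minimal sketch of how such a threshold might be applied on the server. It assumes the parsed JSON returned by the reCAPTCHA v3 `siteverify` endpoint, which includes the fields `success`, `score` (from 0.0 to 1.0, where higher means more likely human) and `action`. The function name `decide`, the 0.5 default threshold and the `"comment"` action label are illustrative assumptions, not values from the announcement.

```python
def decide(verify_response, expected_action, threshold=0.5):
    """Map a parsed siteverify response to 'allow', 'challenge' or 'reject'.

    verify_response: dict parsed from the siteverify JSON reply.
    expected_action: the action label the page attached to the request.
    threshold: hypothetical cut-off chosen by the site administrator.
    """
    # A failed verification (bad or expired token) is rejected outright.
    if not verify_response.get("success", False):
        return "reject"
    # The action label must match the one the page was supposed to send.
    if verify_response.get("action") != expected_action:
        return "reject"
    # Below the threshold, ask for additional verification (for example by phone).
    score = verify_response.get("score", 0.0)
    return "allow" if score >= threshold else "challenge"
```

For example, `decide({"success": True, "score": 0.2, "action": "comment"}, "comment")` returns `"challenge"`, which a site could map to the telephone verification step described above.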
reCAPTCHA v3 also takes into account signals from multiple pages rather than a single page, as was the case in the first and second versions of the technology. reCAPTCHA v3 is already available for installation on sites.[2]
2017: Recursive cortical network manages to break CAPTCHA
Scientists from the American company Vicarious have created an algorithm that solves CAPTCHAs, the most common way to tell a person from a robot. The algorithm is based on computer vision and a recursive cortical network and, according to the developers, can solve the CAPTCHAs of many popular Internet platforms, including PayPal and Yahoo. The work was published in the journal Science.[3]
CAPTCHA
CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart, a fully automated public Turing test that distinguishes a person from a robot) is used to find out who is trying to use a service: a person or a program that automates actions on the Internet. A CAPTCHA is usually based on a task such as reading "floating" letters, picking a word out from the background, or marking photos that contain a specific object. To solve it, a person needs only general knowledge of the surrounding world and basic skills such as reading. A computer, however, requires a huge amount of data to pass such a test. It can recognize standard characters, but it has difficulty with, for example, "floating" letters that it encounters for the first time. For a person, such a task poses no great problem; artificial intelligence, accordingly, must be highly developed, approaching human intelligence, to solve it.
Recursive cortical network
Scientists at Vicarious developed a neural network for solving CAPTCHAs, called the recursive cortical network (RCN). Its design draws on knowledge of how humans process visual information, namely how they effectively separate an object from its background even when the two have a very similar structure. The network can pick out the outline of an object (for example, a letter) against the background, even if part of the object is hidden behind another.
Only about 26 thousand images were used to train the neural network. For comparison, a CAPTCHA recognition algorithm based on a convolutional neural network (CNN) requires several million.
To test the neural network, the researchers used data from Google's publicly available reCAPTCHA generator, whose CAPTCHAs, according to the developers, are comparatively easy for people to recognize and hard for computers. CAPTCHAs from Yahoo, PayPal and BotDetect were also used for testing.
Results of neural network testing
A CAPTCHA scheme is considered broken if a computer manages to solve it in at least one percent of cases. The neural network created by Vicarious solved examples from reCAPTCHA with an accuracy of 66.6%. For comparison, a person recognizes the same combinations with an accuracy of 87%.
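The one-percent criterion can be written as a simple check. This is a minimal sketch: the function names are hypothetical, and treating exactly 1% as "broken" is an assumption about the boundary case, which the text does not specify.

```python
def success_rate(solved, attempts):
    """Fraction of CAPTCHAs solved correctly out of all attempts."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return solved / attempts


def is_broken(solved, attempts, threshold=0.01):
    """True if the automated success rate reaches the threshold (1% by default)."""
    # Assumption: a rate equal to the threshold already counts as broken.
    return success_rate(solved, attempts) >= threshold
```

Under this criterion, a solver that succeeds on 666 of 1000 reCAPTCHA examples, matching the reported 66.6%, is far above the threshold, while one that succeeds on 5 of 1000 is not.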
Examples of CAPTCHAs used for training, and the accuracy of the neural network at the word level (third column) and the letter level (fourth column)
The algorithm also recognized individual characters more accurately than algorithms based on convolutional neural networks: up to 94.3%. By comparison, the accuracy of a convolutional neural network drops significantly as the visual differences between the training and test samples increase.
Accuracy of individual character recognition by the recursive cortical network (RCN) and a convolutional neural network (CNN). y-axis: fraction of difference between the training and test samples
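Word-level and character-level accuracy are different measures, which is why the two reported figures differ. The sketch below shows one common way to compute them. The function names are hypothetical, and the position-by-position character comparison is an assumption, not necessarily the scoring used in the study.

```python
def word_accuracy(predictions, truths):
    """Fraction of CAPTCHAs whose whole string was recognized correctly."""
    if len(predictions) != len(truths) or not truths:
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(p == t for p, t in zip(predictions, truths))
    return correct / len(truths)


def char_accuracy(predictions, truths):
    """Fraction of characters recognized correctly, compared position by position.

    Missing or extra characters count as errors, because the denominator
    for each pair is the longer of the two strings.
    """
    if len(predictions) != len(truths) or not truths:
        raise ValueError("need two non-empty lists of equal length")
    total = 0
    correct = 0
    for p, t in zip(predictions, truths):
        total += max(len(p), len(t))
        correct += sum(a == b for a, b in zip(p, t))
    if total == 0:
        raise ValueError("all strings are empty")
    return correct / total
```

A single wrong letter makes the whole word wrong, so word-level accuracy is never higher than character-level accuracy under this scoring.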
Overall, the effectiveness of the presented algorithm raises the question of whether existing cybersecurity solutions need to be improved and whether new tools must be developed to protect user data from artificial intelligence.



