
Platforma and HFLabs: Secure Data Matching Technology

Product
Developers: Big Data Platform (Platforma), HFLabs, formerly HumanFactorLabs
Last Release Date: 2022/07/19
Technology: Data Mining, MDM - Master Data Management


2022: Testing Secure Data Matching Technology

Platforma, a developer of business solutions based on big data, and the IT company HFLabs have tested a technology for securely matching the data of two different market players. Platforma announced this on July 19, 2022. The parties developed algorithms for transforming and combining databases that take current legal requirements into account and make it possible to find intersections without using personal data. The technology helps companies identify shared customers and offer them joint programs, new loyalty products and services, as well as improve communication with users, increase conversion, and so on.

The first participants in the pilot were VTB and Rostelecom. Using the algorithm, client databases containing about 250 million records in total were matched. The solution from Platforma and HFLabs made it possible to find groups of customers who use the services of both pilot participants without using or transferring their personal data. This was achieved by working with synthetic identifiers (UUIDs), which are not personal data and consist of a random fixed-length set of letters and digits, together with a two-stage scheme for distributed data transformation using a secure "secret" key.
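As an illustration of the synthetic identifiers mentioned above, here is a minimal sketch of how a data owner might assign random UUIDs to its records and keep the mapping private. The function name `assign_synthetic_ids` and the sample records are hypothetical; the source does not describe how the pilot actually generated or stored its identifiers.

```python
import uuid

def assign_synthetic_ids(records):
    """Give every record a random UUID (version 4).

    The returned mapping stays with the data owner. Only the UUIDs,
    which carry no information about the customer, would leave the
    owner's perimeter (an assumption made for this sketch).
    """
    return {str(uuid.uuid4()): record for record in records}

# Hypothetical sample data, not taken from the pilot.
customers = [
    {"name": "Ivan Petrov", "phone": "+70000000001"},
    {"name": "Natalia Ivanova", "phone": "+70000000002"},
]

private_mapping = assign_synthetic_ids(customers)
public_ids = sorted(private_mapping)  # fixed-length strings of letters and digits
```

Each identifier is a 36-character string, so anyone who receives only `public_ids` learns nothing about the underlying customers.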

During preparation, the data is hashed in two stages using a session secret available only to the data owners, and is then transferred to a federated hub, the core of the IT architecture. The hub compares the hashes and finds intersections between the client databases. The exchange model supports several hashing methods, including those specified by GOST.
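The flow described above can be sketched roughly as follows. This is an assumption-laden illustration, not the algorithm used in the pilot: the source does not say what each stage computes, so the sketch assumes a plain SHA-256 digest for stage one and an HMAC-SHA256 keyed with the session secret for stage two, in place of the GOST methods the source mentions. All function names and sample identifiers are hypothetical.

```python
import hashlib
import hmac

def stage_one(identifier: str) -> bytes:
    """Stage 1 (assumed): hash the normalized identifier once, on the owner's side."""
    return hashlib.sha256(identifier.encode("utf-8")).digest()

def stage_two(digest: bytes, session_secret: bytes) -> str:
    """Stage 2 (assumed): re-hash with a session secret known only to the data owners."""
    return hmac.new(session_secret, digest, hashlib.sha256).hexdigest()

def prepare(identifiers, session_secret: bytes) -> set:
    """Run both stages for every identifier; only this set is sent to the hub."""
    return {stage_two(stage_one(i), session_secret) for i in identifiers}

def hub_intersection(hashes_a: set, hashes_b: set) -> set:
    """The hub sees only hashes and returns the ones present in both sets."""
    return hashes_a & hashes_b

# Hypothetical session secret and sample identifiers.
secret = b"hypothetical-session-secret"
company_a = ["+70000000001", "+70000000002", "+70000000003"]
company_b = ["+70000000003", "+70000000004"]

common = hub_intersection(prepare(company_a, secret), prepare(company_b, secret))
print(len(common))  # 1
```

Because the hub never holds the session secret, it cannot recompute the hashes from guessed identifiers. In this sketch the protection rests entirely on the keyed second stage, since an unkeyed hash of a short identifier such as a phone number can be guessed by brute force.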

One of the key difficulties of the project is the differing formats of client data. Even within a single business there are often several IT systems (CRM, billing, credit portfolios, etc.) in which information is stored in different formats, with different sets of fields and attributes. For example, the same name may be spelled "Natalia" in one system and "Natalya" in another. The joint solution from HFLabs and Platforma takes this into account: it first searches for similar records using deduplication mechanisms and algorithms that account for synonyms, typos, interchangeable words, and outdated names of settlements.
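To show why this cleanup has to happen before hashing, here is a minimal sketch of name normalization. It is not the HFLabs deduplication engine: the synonym table `NAME_SYNONYMS` and the function `normalize_name` are hypothetical, and a real system would also need typo handling and reference data for settlement names.

```python
import unicodedata

# Hypothetical synonym table: spelling variants mapped to one canonical form.
NAME_SYNONYMS = {
    "natalya": "natalia",
    "nataliya": "natalia",
}

def normalize_name(raw: str) -> str:
    """Bring a name to a canonical form so equal customers yield equal hashes."""
    text = unicodedata.normalize("NFKC", raw).strip().lower()
    text = " ".join(text.split())  # collapse repeated whitespace
    return NAME_SYNONYMS.get(text, text)

print(normalize_name("  Natalya ") == normalize_name("NATALIA"))  # True
```

Hash functions map even slightly different inputs to unrelated outputs, so two spellings of one name would never intersect at the hub unless both sides normalize them to the same string first.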

The second important criterion in customer matching is speed. The first stage of hashing can take about two days. Matching the two companies' databases, including the second stage of hashing, can then take several hours.

"The specialized solution, adapted by our partner for our task, has proven its efficiency and effectiveness. It is applicable to databases of both individuals and legal entities. We intend to scale this solution and release it as a full-fledged business product. Identifying shared customers will allow various companies to develop new joint loyalty programs or special offers for users and to deepen their knowledge and understanding of customers. At the same time, consumers will gain faster access to new services: for example, a bank will be more willing to issue a loan if it knows that the client regularly pays the provider for cellular service," said Alexey Kashtanov, CEO of Platforma.

"Our solution provides secure, fast and accurate identification of customers in the databases of different organizations. This is an important step towards creating federated ecosystems that unite different companies on equal terms. Businesses will be able not only to find shared customers, but also to understand which goods and services they buy," explained Konstantin Stepanov, executive director of the IT company HFLabs.

In the future, the developed technology will allow Platforma to act as a kind of data bank, where partners store their data in their own cells that other participants cannot access. At the same time, users of such a service will be able to safely combine and match their databases, analyze the combined data and build mathematical models on it, create services and business products, and take part in monetization.