Base system (platform): | Amazon Web Services (AWS) |
Developers: | Amazon |
Technology: | IaaS (Infrastructure as a Service) |
Amazon Web Services users will soon be able to build data processing pipelines that combine various AWS services and on-premises resources, using a new orchestration mechanism called AWS Data Pipeline.
The service is available for beta testing to a limited number of participants. According to Amazon, Data Pipeline automates the movement and processing of any amount of data, with dependency checking. For example, a pipeline can copy transaction logs from an AWS EC2 instance to the AWS S3 storage service once a day and run a weekly analysis of the accumulated data on an AWS Elastic MapReduce cluster.
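The daily-copy and weekly-analysis example can be illustrated with a minimal sketch in plain Python. It does not use the AWS Data Pipeline API. The `Task` class, the task names, and the `due_tasks` function are all hypothetical, and the "every N days since the start date" rule is an assumed simplification of a real schedule. The sketch only shows how tasks with different periods can be ordered so that each one runs after the tasks it depends on.

```python
from dataclasses import dataclass
from datetime import date
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Task:
    """Hypothetical pipeline task: a name, a period, and its dependencies."""
    name: str
    period_days: int          # 1 = daily, 7 = weekly
    depends_on: tuple = ()


# Illustrative tasks modeled on the example in the text (names are assumptions).
TASKS = [
    Task("copy_logs_ec2_to_s3", period_days=1),
    Task("analyze_logs_on_emr", period_days=7,
         depends_on=("copy_logs_ec2_to_s3",)),
]


def due_tasks(tasks, start, today):
    """Return the names of tasks due today, dependencies first."""
    elapsed = (today - start).days
    due = {t.name: set(t.depends_on)
           for t in tasks if elapsed % t.period_days == 0}
    # Keep only dependencies that are themselves scheduled for today.
    graph = {name: deps & set(due) for name, deps in due.items()}
    return list(TopologicalSorter(graph).static_order())


start = date(2024, 1, 1)
weekly_run = due_tasks(TASKS, start, date(2024, 1, 8))   # both tasks are due
daily_run = due_tasks(TASKS, start, date(2024, 1, 2))    # only the copy is due
```

On a day when both tasks are due, the copy is listed before the analysis, because the analysis depends on it.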
To create a pipeline, the user specifies the data sources, the processing operations, the destination, and the execution schedule. The user can also set preconditions that the service must check before starting a task, such as the existence of the file to be processed. Pipelines can include Amazon EC2, Elastic MapReduce, and the user's on-premises resources. Pipelines can be created in the AWS Management Console or by writing scripts.
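The precondition idea described above (check that the input file exists before starting a task) can be sketched locally as follows. This is an illustration only, not the Data Pipeline scripting interface. The helpers `file_exists`, `run_if_ready`, and `count_lines` and the file name `transactions.log` are hypothetical, and a temporary local directory stands in for the real data source.

```python
import tempfile
from pathlib import Path


def file_exists(path):
    """Build a precondition that passes only when `path` is an existing file."""
    return lambda: Path(path).is_file()


def run_if_ready(preconditions, activity):
    """Run `activity` only if every precondition passes; otherwise return None."""
    if all(check() for check in preconditions):
        return activity()
    return None


def count_lines(path):
    """Stand-in processing step: count the lines in the input file."""
    return len(Path(path).read_text().splitlines())


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "transactions.log"      # hypothetical input file
    checks = [file_exists(log)]

    # The file is missing, so the task is skipped.
    skipped = run_if_ready(checks, lambda: count_lines(log))

    # Once the file exists, the same task runs.
    log.write_text("tx1\ntx2\ntx3\n")
    processed = run_if_ready(checks, lambda: count_lines(log))
```

Keeping preconditions as a list of small callables means additional checks, such as a minimum file size, could be added without changing the task itself.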