Kiba: an ETL framework for Ruby
Kiba is a lightweight ETL framework for Ruby.
Job definition (xxx.etl):
# standard-library modules used below (Date.strptime and YAML.load)
require 'date'
require 'yaml'

# declare a ruby method here, for quick reusable logic
def parse_french_date(date)
  Date.strptime(date, '%d/%m/%Y')
end

# or better, include a ruby file which loads reusable assets
# eg: commonly used sources / destinations / transforms, under unit-test
require_relative 'common'

# declare a pre-processor: a block called before the first row is read
pre_process do
  # do something
end

# declare a source where to take data from (you implement it - see notes below)
source MyCsvSource, 'input.csv'

# declare a row transform to process a given field
transform do |row|
  row[:birth_date] = parse_french_date(row[:birth_date])
  # return to keep in the pipeline
  row
end

# declare another row transform, dismissing rows conditionally by returning nil
transform do |row|
  row[:birth_date].year < 2000 ? row : nil
end

# declare a row transform as a class, which can be tested properly
transform ComplianceCheckTransform, eula: 2015

# before declaring a definition, maybe you'll want to retrieve credentials
config = YAML.load(IO.read('config.yml'))

# declare a destination - like source, you implement it (see below)
destination MyDatabaseDestination, config['my_database']

# declare a post-processor: a block called after all rows are successfully processed
post_process do
  # do something
end
Run the job: bundle exec kiba my-data-processing-script.etl
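The job definition refers to a source, a class-based transform, and a destination that you implement yourself, but the notes it points to are not reproduced here. The following is a minimal sketch of what such classes might look like, assuming Kiba's usual conventions: a source responds to each and yields rows, a transform responds to process and returns the row or nil, and a destination responds to write and close. The :eula_year field, the logic inside ComplianceCheckTransform, and the InMemoryDestination class (a stand-in for MyDatabaseDestination) are hypothetical and chosen only for illustration.

```ruby
require 'csv'

# Hypothetical source: receives the arguments given to `source` and
# yields one row (a Hash with symbol keys) per CSV line.
class MyCsvSource
  def initialize(path)
    @path = path
  end

  def each
    CSV.foreach(@path, headers: true, header_converters: :symbol) do |row|
      yield row.to_h
    end
  end
end

# Hypothetical class-based transform: returns the row to keep it in the
# pipeline, or nil to dismiss it. The :eula_year field is an assumption.
class ComplianceCheckTransform
  def initialize(eula:)
    @eula = eula
  end

  def process(row)
    row[:eula_year].to_i >= @eula ? row : nil
  end
end

# Hypothetical destination standing in for MyDatabaseDestination:
# it appends rows to an array instead of writing to a database.
class InMemoryDestination
  def initialize(rows)
    @rows = rows
  end

  def write(row)
    @rows << row
  end

  # Called once after all rows are written; nothing to release here.
  def close
  end
end
```

Because these are plain Ruby classes, they can be unit-tested without running a Kiba job, for example by calling ComplianceCheckTransform.new(eula: 2015).process(row) directly on a sample row.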
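Benetl: an ETL tool for PostgreSQL
Benetl is a free ETL tool for PostgreSQL databases that also supports MySQL. It extracts data from CSV, TXT, and Excel files, transforms it, and loads it into the database.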
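Octopus: an ETL tool for Java
Enhydra Octopus is a Java-based ETL (extract, transform, load) tool. It can connect to any JDBC-compatible database and extracts and processes data according to an XML definition file.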
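Kettle: an open-source ETL tool
Kettle is an open-source ETL tool developed outside China and written in pure Java. It is portable, requires no installation, and offers efficient, stable data extraction (it is also used as a data migration tool). Kettle has two kinds of script files, transformation and job; transformation ...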
Palo ETL Server: an ETL tool
Palo ETL Server is a Java tool for extracting, transforming, and loading data into Palo OLAP Server. The project has been merged into Palo BI Suite and is no longer updated.