The Wayback Machine - https://web.archive.org/web/20200811164635/https://github.com/topics/etl-pipeline

etl-pipeline

Here are 252 public repositories matching this topic...

Jeffail / benthos

A stream processor for mundane tasks written in Go

go golang kafka cqrs etl rabbitmq amqp logs message-bus event-sourcing nats stream-processing message-queue streaming-data stream-processor etl-pipeline

Updated Aug 10, 2020
Go

InterestingLab / waterdrop

A product for massive-scale data computation in production environments. Documentation:

java spark hadoop spark-streaming flink sql-engine etl-framework etl-pipeline

Updated Jul 22, 2020
Java

san089 / goodreads_etl_pipeline

An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.

Updated Mar 9, 2020
Python

AlexIoannides / pyspark-example-project

Example project implementing best practices for PySpark ETL jobs and applications.

python data-science spark etl pyspark data-engineering etl-pipeline etl-job

Updated Jul 9, 2020
Python

san089 / Udacity-Data-Engineering-Projects

A few projects related to data engineering, including data modeling, infrastructure setup on the cloud, data warehousing, and data lake development.

Updated Mar 5, 2020
Python

YotpoLtd / metorikku

A simplified, lightweight ETL Framework based on Apache Spark

scala sql big-data spark etl distributed-computing etl-framework etl-pipeline

Updated Aug 11, 2020
Scala

roadrunnerlenny / etlbox

A lightweight ETL (extract, transform, load) library and data integration toolbox for .NET.

etl csharp-core etl-framework etl-pipeline etl-jobs

Updated Aug 11, 2020
C#

techascent / tech.ml.dataset

Clojure dataframe library and pipeline for data processing and machine learning

machine-learning clojure csv xlsx datascience dataset dataframe etl-pipeline

Updated Aug 11, 2020
Clojure

usc-isi-i2 / dig-etl-engine

Download DIG to run on your laptop or server.

search-engine crawling information-extraction information-visualization etl-framework etl-pipeline

Updated Jan 9, 2019

maxim2266 / csvplus

csvplus extends the standard Go encoding/csv package with a fluent interface, lazy stream operations, indices, and joins.

go csv etl stream-processing fluent-interface csv-format go-csv etl-framework etl-pipeline

Updated Mar 16, 2020
Go

SETL-Developers / setl

A simple Spark-powered ETL framework that just works 🍺

data-science machine-learning framework scala big-data spark pipeline etl data-transformation data-engineering dataset data-analysis modularization setl etl-pipeline

Updated Aug 6, 2020
Scala

sundios / SEO-Dashboard

SEO dashboard built from Search Console data using the Google Search API, a MySQL database, a Node.js REST API (Express.js), and a React dashboard

react mysql dashboard rest-api seo expressjs seotools node-js seo-monitor google-search-console etl-pipeline etl-kpi google-search-console-python

Updated Jul 7, 2020
JavaScript

Mmodarre / AzureDataFactoryHOL

Azure Data Factory Hands On Lab - Step by Step - A Comprehensive Azure Data Factory and Mapping Data Flow step by step tutorial

azure azure-data-factory hands-on-lab azure-key-vault etl-pipeline adf-pipeline filter-activity lookup-activity foreach-activity metadata-activity mapping-dataflows hands-on-azure-data-factory azure-data-factory-tutorial azure-modern-data-warehous web-activity foreach-loop-activity

Updated May 27, 2020

sparsecode / DaFlow

Apache Spark-based data flow (ETL) framework that supports multiple read sources and write destinations of different types, as well as multiple categories of transformation rules.

json scala csv apache-spark hive hadoop avro etl parquet transformation-rules etl-framework etl-pipeline join-data

Updated May 14, 2020
Scala

toddbirchard / jira-database-etl

💾 Script to import issues from a JIRA instance into a database.

flask etl pandas python3 jira-rest-api flask-sqlalchemy etl-pipeline

Updated Jul 28, 2020
Python

tharwaninitin / etlflow

Functional, Composable library in Scala based on ZIO for writing ETL jobs in AWS and GCP https://tharwaninitin.github.io/etlflow/site/

bigquery aws scala spark etl gcp zio etl-framework etl-pipeline

Updated Aug 11, 2020
Scala

xushiyan / kafka-connect-datagen

A Kafka Connect source connector that generates data for tests

java kafka etl kafka-connect data-generator performance-test integration-test etl-pipeline

Updated Jun 26, 2019
Java

visiologyofficial / vixtract

etl etl-framework etl-pipeline etl-components etl-job etl-automation

Updated Jul 31, 2020
HTML

BBVA / data-refinery

Data transformation

data-science data machine-learning etl datascience etl-pipeline

Updated Apr 9, 2018
Python

mdh266 / AirflowETL

Blog post on ETL pipelines with Airflow

python airflow sql database schedule etl postgresql data-engineering data-pipeline etl-pipeline

Updated Jun 7, 2020
Jupyter Notebook

jjasghar / COBOL-on-k8s

Running an ETL pipeline with COBOL on Kubernetes

kubernetes yaml s3-bucket cobol etl-pipeline

Updated Jul 16, 2020
Shell

cyber-drop / ethereum_analytical_db

Ethereum Analytical Database: an Ethereum data access solution that can be used for analytics and application development. The solution runs on ClickHouse, a fast database.

api etl clickhouse ethereum blockchain eth dex erc20 erc223 etl-pipeline erc721 ethereum-etl

Updated Oct 30, 2019
HTML

codingforentrepreneurs / Serverless-Python-Workflow-with-AWS-Lambda

A tutorial to set up and deploy a simple serverless Python workflow with REST API endpoints in AWS Lambda.

python aws data-science aws-lambda serverless etl webscraping etl-pipeline

Updated Apr 22, 2020
Python

InterestingLab / waterdrop-example

Examples of developing Waterdrop plugins.

spark spark-streaming flink sql-engine etl-framework waterdrop etl-pipeline

Updated Jun 11, 2020
Scala

DAppBoard / dappboard-etl

ETL pipeline for the Ethereum blockchain

javascript etl blockchain ethereum-blockchain etl-pipeline

Updated Feb 13, 2019
JavaScript

sanjeevai / disaster-response-pipeline

ETL pipeline combined with supervised learning and grid search to classify text messages sent during a disaster event

sqlite-database supervised-learning grid-search-hyperparameters etl-pipeline data-engineering-pipeline disaster-event

Updated Feb 24, 2019
Python

amanjpro / greenish

Data monitoring tool, monitors the result, not the run

data etl monitoring-tool etl-pipeline etl-jobs

Updated Aug 11, 2020
Scala

vertica / PSTL

Parallel Streaming Transformation Loader

data-science data-mining hadoop bigdata ingestion realtime-messaging vertica streaming-data etl-pipeline

Updated Apr 23, 2019
Java

aws-samples / amazon-sagemaker-predict-accessibility

Build an end-to-end machine learning pipeline to predict the accessibility of playgrounds in NYC

serverless athena glue autopilot etl-pipeline sagemaker sagemaker-deployment ml-engineering etl-solutions

Updated Jul 9, 2020
Jupyter Notebook

sungchun12 / serverless-data-pipeline-gcp

🏭 Schedule a data pipeline in Google Cloud using cloud function, BigQuery, cloud storage, cloud scheduler, stack trace, cloud build, and pub/sub

bigquery sql python3 google-cloud-platform cloud-build cloud-functions etl-pipeline bigquery-schema cicd-promote-to-production cloud-scheduler stackdriver-trace

Updated Jun 4, 2019
Python
