
Architecting Scalable Self-Service Analytics Platforms


How do you create a data environment for self-service analytics, data science, and machine learning that can handle petabytes of data and hundreds of concurrent users? A massively scalable data analytics platform such as Pivotal Greenplum makes cleansed, collated data available at scale to your Dataiku users. In this talk, we provide an overview and demo of how you can rapidly process and query large datasets in Dataiku by taking advantage of Pivotal Greenplum and its in-database analytics functions. We’ll show how to query across diverse datasets, prepare data, train machine learning algorithms, work with geospatial data, and conduct text analysis, all executing within the Pivotal Greenplum data warehouse. We’ll also touch on how to enforce data governance and access control.


Shailesh Doshi
Senior Data Engineer, Pivotal Software

Shailesh Doshi is a data engineer with Pivotal Software who helps customers succeed by drawing on his background and experience in all things data and data science. He finds job satisfaction in helping customers transform their businesses into modern, data-driven organizations, particularly through cloud and data strategy and data-driven, cloud-native application transformation.


