Data team harmony lies with open source

8th May 2017
Source: Dataiku
Posted By : Enaie Azambuja

Harmony and analytics are two terms not often found together, especially when getting a team of diverse data professionals to execute data science chores - together. A data science team is often made up of people from diverse backgrounds, with diverse skillsets - from the machine learning specialist, to the master Python coder, to the beginning data analyst. To successfully build and execute any sized data science project requires harmony across all of the team members.

Everyone needs to work effectively and efficiently, using the tools they know best. What’s more, the growing deficit of Data Scientists, along with the closed nature of many analytics tools makes building effective teams even more difficult. That said, all is not lost.

There exists a vast ecosystem of open source tools that are available to the masses, which can help to level the playing field, and bring data analytics capabilities to professionals of all stripes. Yet, much like the cola wars of the 80’s, there is an almost infinite variety of flavors and formulas that drive tastes, at least when it comes to standardising analytics tool sets.

This is a conundrum that can only be solved by creating harmony among team members and their tools of choice. However, harmony means many things to many people, in the case of data science, harmony takes on the form of people being able to interact with their tools of choice, as well as having some mechanism to orchestrate those tools.

Naturally, orchestration and harmony cannot happen without a conductor, and in the world of data science, that conductor takes the form of a software platform that fuels interoperability, and tears down barriers.

Dataiku digitises that conductor with Dataiku Data Science Studio (DSS), a platform that embraces the ideologies of open source technologies, and bridges those technologies together to give teams choices, while promoting collaboration. Dataiku DSS connects to more than 25 different data storage systems, including closed source and open source databases, such as SQL Server, HDFS, NoSQL, and so forth.

Dataiku DSS also supports numerous programing languages (Python, R, Spark, etc) allowing data professionals to work with the programming tools of their choice and still have connectivity to the data shared by the team. Critical features such as team knowledge sharing, change management, and project monitoring further fuel collaboration, while eliminating silos of operation.

Dataiku DSS’s platform approach centralises open source elements, creating an environment where team knowledge is shared, and never lost when teams are reconfigured. What’s more, integrated to-do lists, document sharing, and unified logs make it easier to onboard new team members, as well as perform forensics on previous projects.

Simply put, open source tools may become the lifeblood of enterprise data science projects, but without proper orchestration, that life blood, as well as communal knowledge is sure to be lost over a short period of time.


You must be logged in to comment

Write a comment

No comments




Sign up to view our publications

Sign up

Sign up to view our downloads

Sign up

Girls in Tech | Catalyst | 2019
4th September 2019
United Kingdom The Brewery, London
DSEI 2019
10th September 2019
United Kingdom EXCEL, London
EMO Hannover 2019
16th September 2019
Germany Hannover
Women in Tech Festival 2019
17th September 2019
United Kingdom The Brewery, London
European Microwave Week 2019
29th September 2019
France Porte De Versailles Paris