Case Study: DataSeers
Founded in 2017, DataSeers is a B2B SaaS company with a mission to address the challenges facing the banking and payments industry. Instead of updating legacy banking software built for the age of checks and wire transfers, DataSeers reimagined back-office software from the perspective of ACH payments and prepaid cards. More specifically, DataSeers focused on the key pain points of fraud, compliance, and reconciliation, and later broadened its scope to include KYC/KYB and onboarding.
Adwait Joshi, CEO and founder of DataSeers, was intimately involved with the development of his company’s big data platform. Joshi and his team evaluated several different options before selecting HPCC Systems. Joshi cited several factors that informed DataSeers’ decision to go with HPCC Systems: the open source license, ease of setup, support for commodity hardware, the ECL programming language, its flexibility, and its support for multiple data formats.
Joshi said his team faced a bit of a learning curve with ECL, but it was well worth their time. “ECL is a very powerful language for data analysis. After a few days working with ECL, the team quickly understood that if they could master about six key processes, there was no program or application they couldn’t develop. ECL is also helpful in that the entire HPCC Systems stack uses ECL. This means it only takes one DataSeers programmer to get a complete HPCC Systems environment up and running for a client. This helps keep our headcount low and profit margins high.”
Joshi praised HPCC Systems’ flexibility and compatibility with other big data technologies. “We use other technologies besides HPCC Systems in our big data environment, like Elasticsearch, for example. But they all work well with HPCC Systems. And if there’s a particular application or feature not readily available in HPCC Systems, I still have the option of leveraging solutions developed in C++ or Python. I can embed their code into ECL to easily integrate them into our HPCC Systems environment.”
Finally, Joshi credited HPCC Systems’ ability to manage multiple data formats as a deciding factor in the company’s choice of platform. “With HPCC Systems’ support for structured and unstructured data, we can ingest and format data of any type and use it to help inform our analysis. CSV, XML, JSON, plain text, or binary files? Doesn’t matter. HPCC Systems’ Thor cluster easily ingests, formats, and enriches all of it, regardless of file type.”
“Setting up our first HPCC Systems cluster took very little time; I believe it was up and running in a matter of hours. We were also thrilled to discover we could run HPCC Systems on commodity hardware, which helped our budget. And now that more of our customers are using cloud and Kubernetes, we’re happy to report to our clients looking to leverage the cloud that HPCC Systems runs very well in those environments, too.”
“When we were first starting up as a company, our budgets were very tight. TCO was a critical part of our decision-making process around which big data platform we chose. The fact that HPCC Systems was available for free through its open source license made it very compelling to us.”