
Reviews from AWS Marketplace

3 AWS reviews

External reviews

354 reviews
from G2

External reviews are not included in the AWS star rating for the product.


    Stephen D.

Staging data for insights

  • May 26, 2021
  • Review provided by G2

What do you like best about the product?
It makes the power of Spark, and innovative solutions like Delta Lake, accessible.
What do you dislike about the product?
There are fewer solutions for workloads that aren't wholly or partially on the cloud.
What problems is the product solving and how is that benefiting you?
We are staging large datasets for reporting and multiple BI solutions.


    Deepa Ram S.

Best tool for big data

  • May 20, 2021
  • Review provided by G2

What do you like best about the product?
It is easy to use commands in multiple languages in the same notebook. It connects directly to Redshift.
What do you dislike about the product?
Sometimes it takes a lot of time to load data. It should show better suggestions.
What problems is the product solving and how is that benefiting you?
We are using Databricks to analyse big data and get business insights.


    Pavan Kumar Y.

One stop shop for all your data problems

  • April 15, 2021
  • Review provided by G2

What do you like best about the product?
It has everything in it: an IDE, version control, scheduling, and whatnot.
What do you dislike about the product?
I haven't found anything that bothers me yet.
What problems is the product solving and how is that benefiting you?
Currently, I'm using it as an ETL tool. It's easy to use and connects with any data source, and it has excellent documentation and help from the community.
Recommendations to others considering the product:
Just go for it. You can do many things you want to do with your data.


    Prashidha K.

Very powerful yet easy to use distributed computing and data warehousing platform

  • January 25, 2021
  • Review verified by G2

What do you like best about the product?
Databricks has very powerful distributed computing built in, with easy-to-deploy, optimized clusters for Spark computations. The notebooks with MLflow integration make it easy to use for Analytics and Data Science teams, yet the underlying APIs and CI/CD integrations make it very customizable for Data Engineers to create complex automated data pipelines. The ability to store, query, and manipulate massive Spark SQL tables with ACID guarantees in Delta Lake makes big data easily accessible to everyone in the organization.
What do you dislike about the product?
It lacks built-in data backup features and the ability to restrict data access to specific users. So if anyone accidentally deletes data from a Delta table or DBFS, the lost data cannot be retrieved unless we set up our own customized backup solution.
What problems is the product solving and how is that benefiting you?
I have worked with big data with hundreds of millions of rows using Databricks. We do most of the ELT, data cleaning, and prep work on Databricks. The ease and speed of querying big data using Databricks Spark SQL is very useful. It is also very easy to create prototype code using real-sized data in the available Python and R notebooks.


    Chad F.

Reduced database network redistributions & run-time of key models by 99+%!

  • August 17, 2020
  • Review provided by G2

What do you like best about the product?
Incidentally, the thing I like most about Databricks isn't a product feature at all; I love Databricks's proactive and customer-centric service, always willing to make an exception or create a unique feature, all the while minimizing costs for the customer - as @Heather Akuiyibo & Shelby Ferson et al. have done for me and my former teams!
What do you dislike about the product?
Broadening programming logic and syntax.
What problems is the product solving and how is that benefiting you?
To name seven (7):

(1) User segmentation using a proprietary variation of a hierarchical DBSCAN clustering algorithm of high-dimensional data with novel distance [quasi] metric, based on hubness analysis;

(2) Leveraging the above in email targeting and invoking multi-armed bandit testing methodologies for email timing, frequency, and content, using decreasing-epsilon strategy;

(3) Modeling predicted underwriting criteria with a binary approval odds classification algorithm;

(4) Using a dynamic panel data, fixed effects model to predict the effect of changes in credit reports on user credit score;

(5) Employing an Autoregressive Integrated Moving Average (ARIMA) with optimized Akaike Information Criterion exploits to predict future revenue and growth (lagged results led to average error bounds of only 5 percent; cross-validation results were even stronger, though I was conservative in guaranteeing 7 percent error, on average);

(6) Refining a multiverse (context-aware) recommendation engine as an n-dimensional tensor (rather than the typical two-dimensional user-item matrix) for partner product recommendations, using High-Order Singular Value Decomposition to solve;

(7) Invoking a Convolutional Neural Network framework with a novel architecture and results of a Fourier Transform as input to classify dental x-rays and highlight to the dentist which teeth require fillings (after approximately two months, the model reached ~95 percent accuracy - in terms of actual agreement by dentists using the app - with F1 score in cross-validation performing on par).
Recommendations to others considering the product:
Be open to the pitch. You may think things are "going fine" or proffer the idea of "if it ain't broke, don't fix it," but these represent short-term thinking traps such that scaling becomes inherently and implicitly constrained and limited. Databricks amounts to the forward-thinking businessperson.


    ianthe L.

How I experienced databricks

  • August 17, 2020
  • Review provided by G2

What do you like best about the product?
It is great when you have a large amount of data, excellent for collaboration, perfect for use with visualisation tools, and it works with many programming languages.
What do you dislike about the product?
It is difficult to get a grasp on how many applications and functions it has.
What problems is the product solving and how is that benefiting you?
It's great for ELT of data to use with Power BI.
Recommendations to others considering the product:
Use it. It's the best available and it's great!


    Somu S.

Excellent infrastructure, can scale clusters in no time

  • August 16, 2020
  • Review provided by G2

What do you like best about the product?
Interactive clusters, user friendly, excellent cluster management
What do you dislike about the product?
The cluster takes some time to warm up on start. It should support upserts without Delta, as businesses need pure upserts too.
What problems is the product solving and how is that benefiting you?
Can seamlessly use PySpark and Python to build a robust pipeline.
Recommendations to others considering the product:
It's the best infrastructure to build pipelines if you are planning to use spark in production


    Vivek P.

Databricks: Big Data processing tool

  • July 16, 2020
  • Review provided by G2

What do you like best about the product?
Very easy to use. No need to install and set up Spark manually.
Provides a notebook environment to write code.
Supports various languages like Python, Spark SQL, R, Scala, etc.
Easy to set up and use.
You can choose the cluster according to your needs.
Supports machine learning flows and streaming data.
Automatically suspends the cluster if it is inactive for more than a given time (cost-cutting).
Auto-scalable clusters.
Optimizes the use of cluster resources.
What do you dislike about the product?
No CI/CD features provided by default.
Costly for small enterprises.
Certification cost is high.
What problems is the product solving and how is that benefiting you?
We have to develop pipelines. We get data from different sources like AWS S3 and Redshift, process that large amount of data on Databricks, and put it back into our data warehouse.
Recommendations to others considering the product:
Splunk is one of the best tools when it comes to big data processing. It is easy to use and set up.


    Ramavtar M.

MLflow: One-stop solution for data science model tracking, versioning, and deployment

  • June 23, 2020
  • Review verified by G2

What do you like best about the product?
1) A single format to support all major ML libraries such as scikit-learn, TensorFlow, MXNet, Spark MLlib, PySpark, etc.
2) Capabilities to deploy on Amazon SageMaker with just one API call
3) Flexibility to log model parameters and metrics such as accuracy and recall, along with hyperparameter tuning support.
4) A good GUI to compare and select the best models.
5) Model registry to track Staging, Production, and Archived models.
6) The Python API is the best supported
7) REST APIs supported.
8) Available out of the box in Microsoft Azure.
What do you dislike about the product?
1) CI/CD pipelines are not supported in the open-source version
2) It is a recent framework, so the community is not very large
3) It depends on many Python libraries, which can be a problem when resolving dependencies in your existing setup.
What problems is the product solving and how is that benefiting you?
I have used it for managing the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
The same thing can be done in Amazon SageMaker, GCP AI Platform, Microsoft Azure, etc., but that would incur monthly expenses. It can be good for an early-stage startup data science team.
Recommendations to others considering the product:
It can't be a complete solution for the data science/ML engineering flow, but it is essential in the pipeline. It may be used with Apache Airflow to have an end-to-end MLOps solution. Also, it works best with Amazon SageMaker and Microsoft Azure. However, GCP AI Platform support is still in the development phase.
You would also need to take care of the CI/CD pipeline for ML models on your own.


    Vikrant B.

Lightning Speed Analytics

  • April 29, 2020
  • Review provided by G2

What do you like best about the product?
Databricks is a great analytics tool that provides lightning-speed analytics and has given new abilities to Data Scientists. Additionally, our advanced analytics at scale has gone up 100 times.
What do you dislike about the product?
The learning curve is steep and people would need coding knowledge to work with Databricks. It can also be costly at times.
What problems is the product solving and how is that benefiting you?
Problems - Analytics problems

Benefits - Scale and Speed