<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <generator uri="https://jekyllrb.com/" version="4.3.4">Jekyll</generator>
  <link href="https://trino.io/broadcast/feed.xml" rel="self" type="application/atom+xml" />
  <link href="https://trino.io/broadcast/" rel="alternate" type="text/html" />
  <updated>2026-04-08T03:00:47+00:00</updated>
  <id>https://trino.io/broadcast/feed.xml</id>

  <title>Trino Community Broadcast episodes</title>

  <subtitle>Trino is a high performance, distributed SQL query engine for big data.</subtitle>

  
    <entry>
      <title>78: A view with a view with a view</title>
      <link href="https://trino.io/episodes/78.html" rel="alternate" type="text/html" title="78: A view with a view with a view" />
      <published>2026-01-16T00:00:00+00:00</published>
      <updated>2026-01-16T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/78</id>
      <content type="html" xml:base="https://trino.io/episodes/78.html">&lt;h2 id=&quot;host&quot;&gt;Host&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Sr. Principal
DevRel Engineer at &lt;a href=&quot;https://chainguard.dev&quot;&gt;Chainguard&lt;/a&gt;, open source hacker at
&lt;a href=&quot;https://github.com/simpligility&quot;&gt;simpligility&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Senior Developer
Advocate at &lt;a href=&quot;https://www.influxdata.com/&quot;&gt;InfluxData&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest&quot;&gt;Guest&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/robfromboulder/&quot;&gt;Rob Dickinson&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-and-news&quot;&gt;Releases and news&lt;/h2&gt;

&lt;h3 id=&quot;trino-478&quot;&gt;&lt;a href=&quot;/docs/current/release/release-478.html&quot;&gt;Trino 478&lt;/a&gt;&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for multiple plugin directories.&lt;/li&gt;
  &lt;li&gt;Propagate queryId to the Open Policy Agent authorizer.&lt;/li&gt;
  &lt;li&gt;Add support for reading encrypted Parquet files with the Hive connector.&lt;/li&gt;
  &lt;li&gt;Add numerous performance improvements and bug fixes for the Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Update Docker container to use Java 25.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;trino-479&quot;&gt;&lt;a href=&quot;/docs/current/release/release-479.html&quot;&gt;Trino 479&lt;/a&gt;&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Require Java 25 to build and run Trino.&lt;/li&gt;
  &lt;li&gt;Publish processing time for a query in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FINISHING&lt;/code&gt; state to event
listeners.&lt;/li&gt;
  &lt;li&gt;Deprecate the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXPLAIN&lt;/code&gt; types &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LOGICAL&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DISTRIBUTED&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Add an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;extraHeaders&lt;/code&gt; option to the JDBC driver and the CLI to
support sending arbitrary HTTP headers.&lt;/li&gt;
  &lt;li&gt;Add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;APPLICATION_DEFAULT&lt;/code&gt; authentication type for GCS.&lt;/li&gt;
  &lt;li&gt;Remove support for unauthenticated access when GCS authentication type is set
to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SERVICE_ACCOUNT&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Add support for setting and dropping column defaults via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALTER TABLE ...
ALTER COLUMN&lt;/code&gt; to the memory connector.&lt;/li&gt;
&lt;/ul&gt;
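
&lt;p&gt;The memory connector change adds the standard column default syntax; as a
rough sketch, with made-up table and column names:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- set a default value used for new rows
ALTER TABLE memory.default.events ALTER COLUMN priority SET DEFAULT 3;

-- remove the default again
ALTER TABLE memory.default.events ALTER COLUMN priority DROP DEFAULT;
&lt;/code&gt;&lt;/pre&gt;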

&lt;p&gt;View &lt;a href=&quot;https://www.youtube.com/watch?v=7clvlAxGFOI&amp;amp;t=6s&amp;amp;pp=ygUSbWFuZnJlZCBtZW50b3JzIDEw&quot;&gt;Manfred mentors 10&lt;/a&gt; for a more detailed discussion.&lt;/p&gt;

&lt;p&gt;As always, numerous performance improvements, bug fixes, and other features were
added as well.&lt;/p&gt;

&lt;h3 id=&quot;other-releases-and-news&quot;&gt;Other releases and news&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Trino Contributor Call minutes are available:
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-22-oct-2025&quot;&gt;October 2025&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-26-nov-2025&quot;&gt;November 2025&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino-query-ui&quot;&gt;Trino query UI&lt;/a&gt;
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.npmjs.com/package/trino-query-ui&quot;&gt;v0.1.1 successfully released&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;Further releases are blocked by an npm process change and the work needed to adapt to it&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;OpenText and Vertica connector
    &lt;ul&gt;
      &lt;li&gt;OpenText is looking for expressions of interest from users - contact Manfred
or comment on the &lt;a href=&quot;https://github.com/trinodb/trino/pull/26904&quot;&gt;PR for potential
removal&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;Working on a collaboration with the Trino project to set up a test environment&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;PowerBI connector for Trino
    &lt;ul&gt;
      &lt;li&gt;Manfred working with Microsoft and others to figure out future plans&lt;/li&gt;
      &lt;li&gt;Microsoft is looking for &lt;a href=&quot;https://community.fabric.microsoft.com/t5/Fabric-Ideas/Trino-connector/idi-p/4849124&quot;&gt;your votes for a Trino Fabric
connector&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Trino 480 and Trino Gateway 17 are hopefully coming soon&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/playlist?list=PLHdo8mJLIMWALFrGgA6-wWcWgyZmjAex-&quot;&gt;Manfred
mentors&lt;/a&gt;
videos about various Trino topics are now up to episode 10&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introducing-rob&quot;&gt;Introducing Rob&lt;/h2&gt;

&lt;p&gt;Rob tells us about his history with Trino, software engineering, and management.&lt;/p&gt;

&lt;h2 id=&quot;a-view-with-a-view-with-a-view&quot;&gt;A view with a view with a view&lt;/h2&gt;

&lt;p&gt;We recap Rob’s past presentation and concepts from Trino Summit 2024 about views
and hierarchies of views. Then we move on to discuss all his recent development
and work. These include the
&lt;a href=&quot;https://github.com/robfromboulder/virtual-view-manifesto&quot;&gt;virtual-view-manifesto&lt;/a&gt;
and the &lt;a href=&quot;https://github.com/robfromboulder/viewmapper&quot;&gt;viewmapper&lt;/a&gt; and
&lt;a href=&quot;https://github.com/robfromboulder/viewzoo&quot;&gt;viewzoo&lt;/a&gt; projects.&lt;/p&gt;

&lt;p&gt;We also chat about Rob’s journey with AI tooling.&lt;/p&gt;

&lt;p&gt;A comparison of how application code accesses database storage with three
different approaches: an ORM layer, a microservice and API layer, and a query
engine and view layer:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/episode/tcb78_virtual_view_comparison.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;A detailed topology of an application taking advantage of virtual view
hierarchies:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/episode/tcb78_virtual_view_topology.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;A concrete example of a view hierarchy for events – two swappable layers, one
for mapping to physical databases, and one for calculating event priority:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/episode/tcb78_virtual_view_example.png&quot; /&gt;&lt;/p&gt;
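
&lt;p&gt;As a rough SQL sketch of such a two-layer hierarchy (all catalog, schema,
and column names here are invented for illustration):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- mapping layer: hides the physical storage location
CREATE VIEW example.views.events_mapped AS
SELECT event_id, event_type, created_at
FROM postgresql.public.events;

-- logic layer: a view on the mapping view that calculates event priority
CREATE VIEW example.views.events_prioritized AS
SELECT event_id, event_type, created_at,
  CASE WHEN event_type = 'error' THEN 1 ELSE 3 END AS priority
FROM example.views.events_mapped;
&lt;/code&gt;&lt;/pre&gt;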

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/robfromboulder/virtual-view-manifesto&quot;&gt;virtual-view-manifesto&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/robfromboulder/viewmapper&quot;&gt;viewmapper&lt;/a&gt; for view storage&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/robfromboulder/viewzoo&quot;&gt;viewzoo&lt;/a&gt; for view visualization&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;28 Jan 2026 - &lt;a href=&quot;/community.html#events&quot;&gt;Trino Contributor Call&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;7 Feb 2026 - &lt;a href=&quot;https://www.meetup.com/trino-apac/events/312457635/&quot;&gt;Trino meetup in Bangalore&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Looking for guests and topics for Trino Community Broadcast 79 and beyond&lt;/li&gt;
&lt;/ul&gt;</content>

      

      <summary>Host</summary>

      
      
    </entry>
  
    <entry>
      <title>77: One tool to proxy them all</title>
      <link href="https://trino.io/episodes/77.html" rel="alternate" type="text/html" title="77: One tool to proxy them all" />
      <published>2025-10-29T00:00:00+00:00</published>
      <updated>2025-10-29T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/77</id>
      <content type="html" xml:base="https://trino.io/episodes/77.html">&lt;h2 id=&quot;host&quot;&gt;Host&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Sr. Principal
DevRel Engineer at &lt;a href=&quot;https://chainguard.dev&quot;&gt;Chainguard&lt;/a&gt;, open source hacker at
&lt;a href=&quot;https://github.com/simpligility&quot;&gt;simpligility&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest&quot;&gt;Guest&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/jordanzimmerman/&quot;&gt;Jordan Zimmerman&lt;/a&gt;, Senior
Staff Engineer at &lt;a href=&quot;https://www.starburst.io/&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/pablo-arteaga-20b547101/&quot;&gt;Pablo Arteaga&lt;/a&gt;,
Software Engineer at
&lt;a href=&quot;https://www.bloomberg.com/company/values/tech-at-bloomberg/&quot;&gt;Bloomberg&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-and-news&quot;&gt;Releases and news&lt;/h2&gt;

&lt;p&gt;Trino 478 is in the final stages of the release process. We will talk about
the details in the next episode.&lt;/p&gt;

&lt;h3 id=&quot;other-releases-and-news&quot;&gt;Other releases and news&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-22-oct-2025&quot;&gt;October contributor call recap and
recording&lt;/a&gt;
is available.&lt;/li&gt;
  &lt;li&gt;The new video tutorial series
&lt;a href=&quot;https://www.youtube.com/playlist?list=PLHdo8mJLIMWALFrGgA6-wWcWgyZmjAex-&quot;&gt;Manfred
mentors&lt;/a&gt;, about working on Trino and other open source projects,
is live now and looking for &lt;a href=&quot;https://github.com/sponsors/mosabua&quot;&gt;sponsors&lt;/a&gt;.
Details about the tasks are available in the &lt;a href=&quot;https://github.com/simpligility/contributions&quot;&gt;contribution tracker
project&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introducing-jordan-and-pablo&quot;&gt;Introducing Jordan and Pablo&lt;/h2&gt;

&lt;p&gt;Manfred chats with Pablo and Jordan about their involvement in the Trino
community. We end up chatting a bunch about the Airlift framework that is a
foundation for Trino since Jordan has been involved in that project for a long
time. Pablo has been involved in Trino itself and worked on the OPA plugin and
the Trino Gateway, among other things.&lt;/p&gt;

&lt;h2 id=&quot;aws-proxy&quot;&gt;aws-proxy&lt;/h2&gt;

&lt;p&gt;The AWS Proxy is an open-source Java toolkit and library, not a standalone
application, designed to act as a transparent proxy for AWS Simple Storage
Service (S3) compatible object storage protocols.&lt;/p&gt;

&lt;p&gt;It was created by developers from Starburst, Bloomberg and other organizations
in the Trino community to address the need for enhanced governance and security
with tools like Apache Spark that lack security controls. It also supports
direct data access to S3 or S3-compatible systems, like MinIO or Dell ECS.&lt;/p&gt;

&lt;h3 id=&quot;key-functionality-and-use-cases&quot;&gt;Key functionality and use cases&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Security and governance layer&lt;/strong&gt;: The primary goal is to prevent client
applications from bypassing governance systems by accessing S3 directly. It
ensures all data access is channeled through the proxy, where custom business
logic can be applied.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Signature handling:&lt;/strong&gt; It handles the complex AWS Signature Version 4 (SIGv4)
protocol used for authenticating requests, which was the most challenging part
of its development.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Emulated credentials&lt;/strong&gt;: Clients are configured to use fake, worthless
credentials that are only recognized by the proxy. The proxy then validates
the user’s identity and request against security policies (like OPA), signs
the request with the real, secure AWS keys (kept safe behind the firewall),
and forwards it to the real S3 store.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Extensibility&lt;/strong&gt;: It’s built on the Airlift framework and uses a simple
Service Provider Interface (SPI) plugin mechanism. This allows users to add
custom logic for authorization, object storage abstraction from buckets to tables,
redirection, and other use cases.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In essence, it takes standard S3 requests from data tools and mediates them,
applying security, control, and abstraction before forwarding them to the actual
data lake storage.&lt;/p&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/assets/episode/tcb77-aws-proxy.pdf&quot;&gt;Presentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/aws-proxy&quot;&gt;aws-proxy&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://github.com/Randgalt/record-builder&quot;&gt;Jordan’s record-builder open source project&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Looking for guests and topics for Trino Community Broadcast 78&lt;/li&gt;
  &lt;li&gt;26 November 2025 - &lt;a href=&quot;/community.html#events&quot;&gt;Trino Contributor Call&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</content>

      

      <summary>Host</summary>

      
      
    </entry>
  
    <entry>
      <title>76: Triple platform treat</title>
      <link href="https://trino.io/episodes/76.html" rel="alternate" type="text/html" title="76: Triple platform treat" />
      <published>2025-09-26T00:00:00+00:00</published>
      <updated>2025-09-26T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/76</id>
      <content type="html" xml:base="https://trino.io/episodes/76.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Sr. Principal
DevRel Engineer at &lt;a href=&quot;https://chainguard.dev&quot;&gt;Chainguard&lt;/a&gt;, open source hacker at
&lt;a href=&quot;https://github.com/simpligility&quot;&gt;simpligility&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest&quot;&gt;Guest&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/jo-perez-data/&quot;&gt;Jo Perez&lt;/a&gt;, Founding Solutions
Engineer at &lt;a href=&quot;https://www.getcollate.io/&quot;&gt;Collate&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/shawn-gordon-37b9916/&quot;&gt;Shawn Gordon&lt;/a&gt;, Sr.
Developer Advocate at &lt;a href=&quot;https://www.getcollate.io/&quot;&gt;Collate&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-and-news&quot;&gt;Releases and news&lt;/h2&gt;

&lt;p&gt;We finally shipped a huge new release:&lt;/p&gt;

&lt;h3 id=&quot;trino-477&quot;&gt;&lt;a href=&quot;/docs/current/release/release-477.html&quot;&gt;Trino 477&lt;/a&gt;&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Add Lakehouse connector.&lt;/li&gt;
  &lt;li&gt;Add SQL language features including &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALTER MATERIALIZED VIEW ... SET
AUTHORIZATION&lt;/code&gt;, default column values, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALTER VIEW ... REFRESH&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Add new SQL functions like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cosine_distance()&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;to_geojson_geometry()&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Add lots of new features to the preview UI.&lt;/li&gt;
&lt;/ul&gt;
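
&lt;p&gt;As a rough sketch of the new statements, with invented view and role names:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- transfer ownership of a materialized view
ALTER MATERIALIZED VIEW example.reports.daily_sales SET AUTHORIZATION analyst;

-- refresh a view to pick up changes in the underlying tables
ALTER VIEW example.reports.current_orders REFRESH;
&lt;/code&gt;&lt;/pre&gt;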

&lt;p&gt;There are too many connector improvements to list them all. Check out the
release notes. Also inspect the changes on the SPI since there are quite a few.&lt;/p&gt;

&lt;p&gt;Importantly, this release also includes some breaking changes.&lt;/p&gt;

&lt;p&gt;As always, numerous performance improvements, bug fixes, and other features were
added as well.&lt;/p&gt;

&lt;p&gt;And before Trino 477 we also shipped Trino Gateway:&lt;/p&gt;

&lt;h3 id=&quot;trino-gateway-16&quot;&gt;&lt;a href=&quot;https://trinodb.github.io/trino-gateway/release-notes/#16&quot;&gt;Trino Gateway 16&lt;/a&gt;&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Add numerous UI improvements and fixes.&lt;/li&gt;
  &lt;li&gt;Require Java 24 and PostgreSQL 17 or higher.&lt;/li&gt;
  &lt;li&gt;Allow default routing group configuration.&lt;/li&gt;
  &lt;li&gt;Improve error propagation with external routing service.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;other-releases-and-news&quot;&gt;Other releases and news&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/charts/tags&quot;&gt;trino-1.41.0 and trino-gateway-1.16.0 Helm charts&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino-python-client/releases/tag/0.336.0&quot;&gt;trino-python-client 0.336.0&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-27-jun-2025&quot;&gt;July contributor call recap and recording&lt;/a&gt; is available.&lt;/li&gt;
  &lt;li&gt;The September contributor call recap and recording from Wednesday is in the works.&lt;/li&gt;
  &lt;li&gt;Java 25 shipped and adoption in Trino is on the way.&lt;/li&gt;
  &lt;li&gt;The new &lt;a href=&quot;https://github.com/trinodb/trino-odbc&quot;&gt;trino-odbc&lt;/a&gt; project was
contributed by &lt;a href=&quot;https://github.com/rileymcdowell&quot;&gt;Riley McDowell&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/dprophet&quot;&gt;Erik Anderson&lt;/a&gt; is stepping up as subproject
maintainer for the ODBC driver.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/vagaerg&quot;&gt;Pablo Arteaga&lt;/a&gt; will lead the new efforts for
better OPA tooling and support in the &lt;a href=&quot;https://github.com/trinodb/trino-opa-tools&quot;&gt;trino-opa-tools
repository&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;We send our thanks to &lt;a href=&quot;https://github.com/mosiac1&quot;&gt;Cristian Osiac&lt;/a&gt; for his
contributions as subproject maintainer for
&lt;a href=&quot;https://github.com/trinodb/aws-proxy&quot;&gt;aws-proxy&lt;/a&gt;. He is unfortunately
stepping down from this work.&lt;/li&gt;
  &lt;li&gt;Trino recently overtook the old Presto in the &lt;a href=&quot;https://db-engines.com/en/ranking&quot;&gt;DB-Engines
ranking&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introducing-jo-and-shawn&quot;&gt;Introducing Jo and Shawn&lt;/h2&gt;

&lt;p&gt;We chat with Jo and Shawn about their background in the big data and data lake
community and beyond.&lt;/p&gt;

&lt;h2 id=&quot;collate&quot;&gt;Collate&lt;/h2&gt;

&lt;p&gt;We talk about the &lt;a href=&quot;https://open-metadata.org/&quot;&gt;OpenMetadata open source project&lt;/a&gt;
as a unified platform for data discovery, observability, and governance, with
80+ data connectors and a collaborative interface.&lt;/p&gt;

&lt;p&gt;Jo and Shawn teach us how OpenMetadata can help build and manage high-quality
data assets at scale, with case studies, documentation, and community resources,
and we dive into how Collate offers a platform around OpenMetadata and more.&lt;/p&gt;

&lt;h2 id=&quot;triple-platform-treat&quot;&gt;Triple platform treat&lt;/h2&gt;

&lt;p&gt;Building a modern data platform isn’t just about picking tools—it’s about
creating a unified ecosystem where performance, governance, and trust work
seamlessly together. See how the power trio of Trino, Collate, and Apache Ranger
transforms your data operations:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Trino: Lightning-fast analytics at scale. Query across any data source, any
format, anywhere—without the complexity of data movement or vendor lock-in.&lt;/li&gt;
  &lt;li&gt;Collate: Intelligent data trust and discovery. AI-powered profiling, automated
quality testing, and smart alerting that keeps your data reliable and
discoverable.&lt;/li&gt;
  &lt;li&gt;Apache Ranger: Enterprise-grade security and governance. Fine-grained access
controls, policy management, and audit trails that keep your data secure and
compliant.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The integration advantage: Watch these three platforms work together to deliver
what every data team needs—fast queries, trusted data, and bulletproof
security—all in one cohesive stack.&lt;/p&gt;

&lt;p&gt;Jo and Shawn tell us more about “Trino + Collate + Apache Ranger = Data Platform
Excellence”, talk about the components and value provided by each of them, and
dive in with a demo, while Manfred and Cole ask more questions to dive deeper.&lt;/p&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/assets/episode/tcb76-collate.pdf&quot;&gt;Presentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://open-metadata.org/&quot;&gt;OpenMetadata&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.getcollate.io/&quot;&gt;Collate&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.getcollate.io/connectors/database/trino&quot;&gt;Collate Trino connector&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://youtu.be/x4BvgSMitL0&quot;&gt;&lt;i class=&quot;fab fa-youtube&quot; style=&quot;color: red;&quot;&gt;&lt;/i&gt; Apache Ranger sink for
reverse metadata with Collate&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Trino Community Broadcast 77: One tool to proxy them all (aws-proxy) planned
for October&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let us know if you want to be a guest in a future broadcast.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>75: Your app sees clearly into Trino</title>
      <link href="https://trino.io/episodes/75.html" rel="alternate" type="text/html" title="75: Your app sees clearly into Trino" />
      <published>2025-07-05T00:00:00+00:00</published>
      <updated>2025-07-05T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/75</id>
      <content type="html" xml:base="https://trino.io/episodes/75.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Dev Rel Engineer
at &lt;a href=&quot;https://chainguard.dev&quot;&gt;Chainguard&lt;/a&gt;, open source hacker at 
&lt;a href=&quot;https://github.com/simpligility&quot;&gt;simpligility&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://www.firebolt.io/&quot;&gt;Firebolt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest&quot;&gt;Guest&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/trevor-denning/&quot;&gt;Trevor Denning&lt;/a&gt;, Solutions
Engineer at &lt;a href=&quot;https://insightsoftware.com/&quot;&gt;insightsoftware&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases&quot;&gt;Releases&lt;/h2&gt;

&lt;p&gt;What’s going on with our releases?&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Summer slump&lt;/li&gt;
  &lt;li&gt;Reduced maintainer work&lt;/li&gt;
  &lt;li&gt;Necessary migration for Maven Central as release blocker&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Other announcements:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-27-jun-2025&quot;&gt;June contributor call recap and recording&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/foundation.html&quot;&gt;Trino Software Foundation&lt;/a&gt; and
&lt;a href=&quot;/sponsor.html&quot;&gt;documentation for supporting the project&lt;/a&gt; on
the website.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introducing-trevor&quot;&gt;Introducing Trevor&lt;/h2&gt;

&lt;p&gt;Trevor has been developing software for over 20 years and has deep knowledge of
ODBC and JDBC drivers for databases. He tells us more about his experience and
how he came to learn about Trino.&lt;/p&gt;

&lt;h2 id=&quot;more-about-insightsoftware&quot;&gt;More about insightsoftware&lt;/h2&gt;

&lt;p&gt;We untangle the long history of Simba, Logi Symphony, and insightsoftware with
the Trino project up to the current status, before we dive into the technical
details.&lt;/p&gt;

&lt;h2 id=&quot;odbc-and-jdbc&quot;&gt;ODBC and JDBC&lt;/h2&gt;

&lt;p&gt;After talking a bit about Trino, Iceberg, data lakes and related topics, we get
into the details about Simba Trino data connectivity with the ODBC and JDBC
drivers.&lt;/p&gt;

&lt;h2 id=&quot;demo&quot;&gt;Demo&lt;/h2&gt;

&lt;p&gt;Trevor shows us how you can use the ODBC driver to query Trino catalogs from
Microsoft Excel, which is arguably the most widely used reporting and analytics
tool, despite really being a spreadsheet application. After that demo he moves
on to some business intelligence analytics with PowerBI.&lt;/p&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/assets/episode/tcb75-insightsoftware.pdf&quot;&gt;Presentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://insightsoftware.com/drivers/trino-odbc-jdbc/&quot;&gt;Simba Trino ODBC &amp;amp; JDBC Drivers&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://documentation.insightsoftware.com/simba-home-olh/content/homepage/trino.htm&quot;&gt;Simba Trino Driver Documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/ecosystem/client-application.html#logi-symphony&quot;&gt;Logi Symphony&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://insightsoftware.com/resources/scaling-bi-with-trino-and-apache-iceberg/&quot;&gt;Video: Scaling BI with Trino and Apache Iceberg&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://insightsoftware.com/blog/unlocking-trinos-full-potential-with-simba-drivers-for-bi-etl/&quot;&gt;Blog post: Unlocking Trino’s Full Potential With Simba Drivers for BI &amp;amp; ETL&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://insightsoftware.com/blog/enhance-trino-performance-with-simbas-powerful-connectivity/&quot;&gt;Blog post: Enhance Trino Performance With Simba’s Powerful Connectivity&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;We give a quick update on where to see Cole or Manfred next, and talk about
upcoming Trino events:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Meet Manfred at the &lt;a href=&quot;https://www.chainguard.dev/&quot;&gt;Chainguard&lt;/a&gt; booth at the Black Hat conference in Las Vegas&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings&quot;&gt;Trino Contributor
Call&lt;/a&gt; planned for
the 23rd of July&lt;/li&gt;
  &lt;li&gt;Trino Community Broadcast: One tool to proxy them all (aws-proxy)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let us know if you want to be a guest in a future broadcast.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>74: Insights from a Norse god</title>
      <link href="https://trino.io/episodes/74.html" rel="alternate" type="text/html" title="74: Insights from a Norse god" />
      <published>2025-06-06T00:00:00+00:00</published>
      <updated>2025-06-06T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/74</id>
      <content type="html" xml:base="https://trino.io/episodes/74.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Dev Rel Engineer
at &lt;a href=&quot;https://chainguard.dev&quot;&gt;Chainguard&lt;/a&gt;
&lt;a href=&quot;https://x.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://www.firebolt.io/&quot;&gt;Firebolt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest&quot;&gt;Guest&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/jeschkies/&quot;&gt;Karsten Jeschkies&lt;/a&gt; from &lt;a href=&quot;https://grafana.com/&quot;&gt;Grafana
Labs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases&quot;&gt;Releases&lt;/h2&gt;

&lt;p&gt;Following are some highlights of the recent releases:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-475.html&quot;&gt;Trino 475&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CORRESPONDING&lt;/code&gt; clause in set operations.&lt;/li&gt;
  &lt;li&gt;Add support for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AUTO&lt;/code&gt; grouping set that includes all non-aggregated
columns in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SELECT&lt;/code&gt; clause.&lt;/li&gt;
  &lt;li&gt;Allow cross-region data retrieval when using the S3 native filesystem.&lt;/li&gt;
  &lt;li&gt;Add support for all storage classes when using the S3 native filesystem for
writes.&lt;/li&gt;
  &lt;li&gt;Numerous improvements on Iceberg, Hive, and Delta Lake connectors.&lt;/li&gt;
  &lt;li&gt;SPI - Remove the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LazyBlock&lt;/code&gt; class.&lt;/li&gt;
&lt;/ul&gt;
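
&lt;p&gt;The first two items can be sketched as follows, with invented table and
column names:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- match set operation columns by name instead of position
SELECT name, price FROM catalog_a.shop.items
UNION CORRESPONDING
SELECT price, name FROM catalog_b.shop.items;

-- group by all non-aggregated columns in the SELECT clause
SELECT region, product, sum(sales) AS total
FROM example.shop.orders
GROUP BY AUTO;
&lt;/code&gt;&lt;/pre&gt;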

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-476.html&quot;&gt;Trino 476&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Another big release with lots of changes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Require JDK 24 as runtime.&lt;/li&gt;
  &lt;li&gt;Add support for comparing values of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;geometry&lt;/code&gt; type.&lt;/li&gt;
  &lt;li&gt;Remove Example HTTP connector from binaries.&lt;/li&gt;
  &lt;li&gt;New required JVM config for BigQuery and Snowflake connectors.&lt;/li&gt;
  &lt;li&gt;Fix regression with graceful shutdown from Trino 474.&lt;/li&gt;
  &lt;li&gt;Improve performance of selective joins for federated queries for nearly all
connectors.&lt;/li&gt;
  &lt;li&gt;Add columns to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$all_manifests&lt;/code&gt; metadata tables for Iceberg tables.&lt;/li&gt;
  &lt;li&gt;Add support for user-assigned managed identity authentication for AzureFS for
object storage connectors.&lt;/li&gt;
  &lt;li&gt;Add support for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FOR TIMESTAMP AS OF&lt;/code&gt; clause in Delta Lake connector.&lt;/li&gt;
&lt;/ul&gt;
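
&lt;p&gt;A quick sketch of the new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FOR TIMESTAMP AS OF&lt;/code&gt; clause for Delta Lake
time travel - catalog, schema, and table names are hypothetical:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;SELECT *
FROM delta.sales.orders
  FOR TIMESTAMP AS OF TIMESTAMP '2025-06-01 00:00:00 UTC';
&lt;/code&gt;&lt;/pre&gt;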

&lt;p&gt;As always, numerous performance improvements, bug fixes, and other features were
added as well.&lt;/p&gt;

&lt;p&gt;Other releases and announcements:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Trino Gateway 16 is still delayed, but Trino Gateway Helm chart 1.15.2 is available&lt;/li&gt;
  &lt;li&gt;Trino Helm chart with 475 -&amp;gt; 1.39.1&lt;/li&gt;
  &lt;li&gt;Trino Python client &lt;a href=&quot;https://github.com/trinodb/trino-python-client/releases/tag/0.334.0&quot;&gt;0.334.0&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introducing-karsten-and-grafana-labs&quot;&gt;Introducing Karsten and Grafana Labs&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/jeschkies/&quot;&gt;Karsten Jeschkies&lt;/a&gt; is an experienced software
engineer:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;2013 - 2016 Engineer at the Core Machine Learning team at Amazon&lt;/li&gt;
  &lt;li&gt;2016 - 2020 Mesosphere and D2IQ, maintainer of Marathon, a container
orchestrator for Mesos&lt;/li&gt;
  &lt;li&gt;2020 - now Maintainer of Loki for two years and now Cloud Provider
observability engineer at Grafana Labs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://grafana.com/&quot;&gt;Grafana Labs&lt;/a&gt; is the home of the well-known Grafana for
visualizations and dashboards, as well as other powerful products such as Grafana Tempo,
Grafana Mimir, and Grafana Loki. Grafana Labs is also involved in well-known projects
such as Prometheus and OpenTelemetry.&lt;/p&gt;

&lt;h2 id=&quot;log-management-with-loki&quot;&gt;Log management with Loki&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://grafana.com/oss/loki/&quot;&gt;Loki&lt;/a&gt; is a horizontally-scalable,
highly-available, multi-tenant log aggregation system inspired by Prometheus. It
helps you to drill into petabytes of logging data.&lt;/p&gt;

&lt;h2 id=&quot;analytics-with-trino&quot;&gt;Analytics with Trino&lt;/h2&gt;

&lt;p&gt;Karsten tells us about his motivation to create a Trino connector, how the two
tools work together, what features are available, and what his plans are for the
future.&lt;/p&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/assets/episode/tcb74-loki-connector.pdf&quot;&gt;Presentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/jeschkies/loki-trino-demo&quot;&gt;Demo source code&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://grafana.com/oss/loki/&quot;&gt;Grafana Loki website&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/grafana/loki&quot;&gt;Loki source code repo&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/docs/current/connector/loki.html&quot;&gt;Loki connector documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Quick update on where to see Cole or Manfred next, and then join us for the
upcoming Trino events:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Trino Contributor Call - May skipped, June edition to be determined&lt;/li&gt;
  &lt;li&gt;Trino Community Broadcast: Visualizing with Logi Symphony and ODBC&lt;/li&gt;
  &lt;li&gt;Trino Community Broadcast: One tool to proxy them all (aws-proxy)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let us know if you want to be a guest in a future broadcast.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>73: Wrapping Trino packages with a bow</title>
      <link href="https://trino.io/episodes/73.html" rel="alternate" type="text/html" title="73: Wrapping Trino packages with a bow" />
      <published>2025-04-09T00:00:00+00:00</published>
      <updated>2025-04-09T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/73</id>
      <content type="html" xml:base="https://trino.io/episodes/73.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Dev Rel Engineer
at &lt;a href=&quot;https://chainguard.dev&quot;&gt;Chainguard&lt;/a&gt;
&lt;a href=&quot;https://x.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://www.firebolt.io/&quot;&gt;Firebolt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases&quot;&gt;Releases&lt;/h2&gt;

&lt;p&gt;Following are some highlights of the recent releases:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-473.html&quot;&gt;Trino 473&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for array literals.&lt;/li&gt;
  &lt;li&gt;Add LDAP-based group provider.&lt;/li&gt;
  &lt;li&gt;Remove the deprecated glue-v1 metastore type.&lt;/li&gt;
  &lt;li&gt;Remove the deprecated Databricks Unity catalog integration.&lt;/li&gt;
  &lt;li&gt;Remove the Kudu connector.&lt;/li&gt;
  &lt;li&gt;Remove the Phoenix connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But don’t use 473 since there were &lt;a href=&quot;https://github.com/trinodb/trino/issues/25381&quot;&gt;some breaking changes&lt;/a&gt;, fixed in…&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-474.html&quot;&gt;Trino 474&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Fix a correctness bug in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GROUP BY&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DISTINCT&lt;/code&gt; queries with a large number
of unique groups.&lt;/li&gt;
  &lt;li&gt;Add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;originalUser&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;authenticatedUser&lt;/code&gt; as resource group selectors.&lt;/li&gt;
  &lt;li&gt;Use JDK 24 as the runtime in the Docker container.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As always, numerous performance improvements, bug fixes, and other features were
added as well. Java 24 is coming as a requirement soon - test the container!&lt;/p&gt;

&lt;p&gt;Releases continue to be slower. Trino needs your help.&lt;/p&gt;

&lt;p&gt;Other releases and announcements:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Trino Gateway 16 is delayed, but Trino Gateway Helm chart 1.15.1 is available&lt;/li&gt;
  &lt;li&gt;Trino Helm chart with 474 -&amp;gt; 1.38.0&lt;/li&gt;
  &lt;li&gt;New book: &lt;a href=&quot;/blog/2025/03/27/olap-principles-book.html&quot;&gt;Core Principles and Design Practices of OLAP Engines from Yiteng Xu
and Gary Gao&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Massive new contribution looking for helpers - &lt;a href=&quot;https://github.com/trinodb/trino-query-ui&quot;&gt;trino-query-ui&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s explore the query ui repo a bit more…&lt;/p&gt;

&lt;h2 id=&quot;application-packaging-and-trino&quot;&gt;Application packaging and Trino&lt;/h2&gt;

&lt;p&gt;Manfred and Cole muse about the package artifacts from Trino, their history,
scope, and pain points:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;RPM&lt;/li&gt;
  &lt;li&gt;tarball&lt;/li&gt;
  &lt;li&gt;Docker container&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of them have had issues, and everyone knew about them. Manfred documented
much of their usage in &lt;a href=&quot;/trino-the-definitive-guide&quot;&gt;Trino: The Definitive
Guide&lt;/a&gt;. In 2024 Manfred finally wrote down some ideas, and in recent months he
implemented many of them.&lt;/p&gt;

&lt;p&gt;We discuss a few aspects such as the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Plugin architecture of Trino&lt;/li&gt;
  &lt;li&gt;What plugins are core or optional?&lt;/li&gt;
  &lt;li&gt;Are artifacts ready to use or not?&lt;/li&gt;
  &lt;li&gt;How painful is configuration?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;demo-time&quot;&gt;Demo time&lt;/h2&gt;

&lt;p&gt;In our demo session we look at some of the changes and the new trino-packages
repository:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;RPM removal from Trino, and replacement module&lt;/li&gt;
  &lt;li&gt;trino-server-core tarball in Trino and plugin selection&lt;/li&gt;
  &lt;li&gt;trino-server-custom module&lt;/li&gt;
  &lt;li&gt;trinodb/trino-core:latest Docker container in Trino&lt;/li&gt;
  &lt;li&gt;custom-docker module&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred runs a build, shows the results, and walks through the structure and
instructions of the packages repository. To finish off, we talk about next steps
such as removing plugins from the default binaries and therefore making them
optional.&lt;/p&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino-packages&quot;&gt;trino-packages repository&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/issues/22597&quot;&gt;Packaging improvement issue&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/docs/current/installation/plugins.html&quot;&gt;Trino plugin documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Quick update on where to see Cole or Manfred next, and then join us for the
upcoming Trino events:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Trino Contributor Call - 23rd of April&lt;/li&gt;
  &lt;li&gt;Trino Community Broadcast 74: One tool to proxy them all (aws-proxy)&lt;/li&gt;
  &lt;li&gt;Trino Community Broadcast 75: Insights from a Norse god (Loki connector)&lt;/li&gt;
  &lt;li&gt;Trino Community Broadcast 76: Visualizing with Logi Symphony and ODBC&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let us know if you want to be a guest in a future broadcast.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>72: Keeping the lake clean</title>
      <link href="https://trino.io/episodes/72.html" rel="alternate" type="text/html" title="72: Keeping the lake clean" />
      <published>2025-03-17T00:00:00+00:00</published>
      <updated>2025-03-17T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/72</id>
      <content type="html" xml:base="https://trino.io/episodes/72.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Dev Rel Engineer
at &lt;a href=&quot;https://chainguard.dev&quot;&gt;Chainguard&lt;/a&gt;
&lt;a href=&quot;https://x.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://www.firebolt.io/&quot;&gt;Firebolt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/viktor-kessler&quot;&gt;Viktor Kessler&lt;/a&gt;, Co-founder at Vakamo&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/thielc&quot;&gt;Christian Thiel&lt;/a&gt;, Co-founder at Vakamo&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases&quot;&gt;Releases&lt;/h2&gt;

&lt;p&gt;Following are some highlights of the recent releases:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-472.html&quot;&gt;Trino 472&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Color the server console output for improved readability.&lt;/li&gt;
  &lt;li&gt;Fix initialization failure for the DuckDB connector on Docker container.&lt;/li&gt;
  &lt;li&gt;Add support for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;row&lt;/code&gt; type and generate empty values for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;array&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt;,
and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;json&lt;/code&gt; types in the Faker connector.&lt;/li&gt;
  &lt;li&gt;Add the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$partition&lt;/code&gt; hidden column in the Iceberg connector.&lt;/li&gt;
&lt;/ul&gt;
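
&lt;p&gt;The new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$partition&lt;/code&gt; hidden column can be queried like a regular column,
for example to count rows per partition - schema and table names are made up:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;SELECT &quot;$partition&quot;, count(*)
FROM iceberg.analytics.events
GROUP BY 1;
&lt;/code&gt;&lt;/pre&gt;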

&lt;p&gt;As always, numerous performance improvements, bug fixes, and other features were
added as well.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trinodb.github.io/trino-gateway/release-notes/#15&quot;&gt;Trino Gateway 15&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Pop up messages in UI&lt;/li&gt;
  &lt;li&gt;Consistent use of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.yaml&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Use of OpenMetrics data from Trino clusters&lt;/li&gt;
  &lt;li&gt;Fix query errors when the adhoc routing group has no healthy backends.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introducing-viktor-and-christian&quot;&gt;Introducing Viktor and Christian&lt;/h2&gt;

&lt;p&gt;We talk with Viktor and Christian about their experience in software engineering
and the world of big data, and what led them to start Vakamo together.&lt;/p&gt;

&lt;h2 id=&quot;metastores-and-catalogs&quot;&gt;Metastores and catalogs&lt;/h2&gt;

&lt;p&gt;We talk about data lakes, data lakehouses, object storage and the role of
metadata. Details we cover include the Hive Metastore Service, the Thrift
protocol, AWS Glue, and the new wave of catalogs. Specifically we also talk
about Apache Iceberg and the Iceberg REST catalog standard as a basis for
Lakekeeper, and then learn all the details about Lakekeeper.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/logos/lakekeeper-small.png&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;demo-time&quot;&gt;Demo time&lt;/h2&gt;

&lt;p&gt;In their demo Viktor and Christian show a multi-user Trino cluster secured by
OAuth 2, Open Policy Agent, and Lakekeeper.&lt;/p&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://lakekeeper.io/&quot;&gt;Lakekeeper&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.lakekeeper.io/&quot;&gt;Lakekeeper documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/lakekeeper/lakekeeper&quot;&gt;Lakekeeper source&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/lakekeeper/lakekeeper/tree/main/examples/trino-opa&quot;&gt;Example project with Trino and OPA&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/docs/current/connector/iceberg.html&quot;&gt;Iceberg connector documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/docs/current/object-storage/metastores.html#rest-catalog&quot;&gt;Iceberg REST catalog documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Join us for upcoming events and let us know if you want to be a guest:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Trino Community Broadcast 73: Wrapping Trino packages with a bow&lt;/li&gt;
&lt;/ul&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>71: Fake it real good</title>
      <link href="https://trino.io/episodes/71.html" rel="alternate" type="text/html" title="71: Fake it real good" />
      <published>2025-02-27T00:00:00+00:00</published>
      <updated>2025-02-27T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/71</id>
      <content type="html" xml:base="https://trino.io/episodes/71.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director/Open
Source Engineering at &lt;a href=&quot;https://trino.io/users.html#starburst&quot;&gt;Starburst&lt;/a&gt; -
&lt;a href=&quot;https://x.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://www.firebolt.io/&quot;&gt;Firebolt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest&quot;&gt;Guest&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/janwas/&quot;&gt;Jan Waś&lt;/a&gt;, 
Software Engineer at &lt;a href=&quot;https://trino.io/users.html#starburst&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases&quot;&gt;Releases&lt;/h2&gt;

&lt;p&gt;Following are some highlights of the recent releases:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-471.html&quot;&gt;Trino 471&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add &lt;a href=&quot;https://trino.io/docs/current/functions/ai.html&quot;&gt;AI functions&lt;/a&gt; for textual
tasks on data using OpenAI, Anthropic, or other LLMs using Ollama as backend.&lt;/li&gt;
  &lt;li&gt;Add support for logging output to the console in JSON format (useful in containers).&lt;/li&gt;
  &lt;li&gt;Support additional Python libraries for use with Python user-defined functions.&lt;/li&gt;
  &lt;li&gt;Remove the RPM package.&lt;/li&gt;
  &lt;li&gt;Add &lt;a href=&quot;https://trino.io/docs/current/object-storage/file-system-local.html&quot;&gt;local file system support&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Add support for S3 Tables in Iceberg connector.&lt;/li&gt;
&lt;/ul&gt;
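
&lt;p&gt;A hedged example of the new AI functions - this assumes a catalog named
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ai&lt;/code&gt; configured with an LLM provider, and the schema, table, and column
names are made up:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;SELECT comment, ai.ai_analyze_sentiment(comment) AS sentiment
FROM shop.public.reviews
LIMIT 10;
&lt;/code&gt;&lt;/pre&gt;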

&lt;p&gt;As always, numerous performance improvements, bug fixes, and other features were
added as well.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trinodb.github.io/trino-gateway/release-notes/#14&quot;&gt;Trino Gateway 14&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Our first Trino Gateway release of 2025 shipped, and it is packed with great new
features and fixes. Some examples are the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Rules editor in the web interface&lt;/li&gt;
  &lt;li&gt;Automatic database schema update and support for Oracle&lt;/li&gt;
  &lt;li&gt;Trino cluster monitoring with JMX and OpenMetrics&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introducing-jan-waś&quot;&gt;Introducing Jan Waś&lt;/h2&gt;

&lt;p&gt;Jan, also known as &lt;a href=&quot;https://github.com/nineinchnick/&quot;&gt;nineinchnick on GitHub&lt;/a&gt;,
is a very active Trino contributor with a wide range of his own plugins and
projects. He is subproject maintainer for the Helm charts and the Grafana
plugin, and is heavily involved in GitHub actions setup and numerous other
efforts. Jan resides in Poland. When he is not working on Trino, you can find
him at metal, electronics, and even opera concerts across Europe or at home
playing video games.&lt;/p&gt;

&lt;h2 id=&quot;datafaker-faker-connector-and-trino&quot;&gt;Datafaker, Faker connector, and Trino&lt;/h2&gt;

&lt;p&gt;We talk about using simulated data from the TPC-H and TPC-DS connectors to learn
SQL and use it for other scenarios such as benchmarking, testing for SQL
support, and validating other connectors and data sources. This leads us to the
limitations of these connectors and how the Faker connector is the next step.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/logos/datafaker-small.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Jan tells us about the Datafaker library and his motivation to create a
connector, and how it eventually landed in Trino itself.&lt;/p&gt;
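
&lt;p&gt;As a rough sketch, a Faker catalog using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;connector.name=faker&lt;/code&gt; lets you
declare tables with Datafaker generator expressions - the table definition below
is illustrative:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;CREATE TABLE faker.default.people (
  id uuid NOT NULL,
  name varchar NOT NULL WITH (generator = '#{Name.first_name} #{Name.last_name}'),
  born_at timestamp NOT NULL
);

SELECT * FROM faker.default.people LIMIT 5;
&lt;/code&gt;&lt;/pre&gt;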

&lt;h2 id=&quot;demo-time&quot;&gt;Demo time&lt;/h2&gt;

&lt;p&gt;Jan shows us how to configure the connector and then demos a number of use
cases from learning SQL to populating and testing other data sources.&lt;/p&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/docs/current/connector/faker.html&quot;&gt;Faker connector documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/ecosystem/data-source.html#datafaker&quot;&gt;Datafaker project&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/reports&quot;&gt;Trino reports repository&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/nineinchnick/&quot;&gt;Other project repositories from Jan&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2023/06/28/trino-fest-2023-starburst-recap.html&quot;&gt;Zero-cost reporting, presented at Trino Fest 2023&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Watch the &lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings&quot;&gt;recording of the Trino contributor call or read the
minutes&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Join us for upcoming events and let us know if you want to be a guest:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Trino Community Broadcast 72: Keeping the lake clean, all about
&lt;a href=&quot;https://lakekeeper.io/&quot;&gt;Lakekeeper&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Trino Community Broadcast 73: Wrapping Trino packages with a bow&lt;/li&gt;
&lt;/ul&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>70: Previewing a new UI</title>
      <link href="https://trino.io/episodes/70.html" rel="alternate" type="text/html" title="70: Previewing a new UI" />
      <published>2025-02-13T00:00:00+00:00</published>
      <updated>2025-02-13T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/70</id>
      <content type="html" xml:base="https://trino.io/episodes/70.html">&lt;h2 id=&quot;host&quot;&gt;Host&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director/Open
Source Engineering at &lt;a href=&quot;https://trino.io/users.html#starburst&quot;&gt;Starburst&lt;/a&gt; -
&lt;a href=&quot;https://x.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/peter-kosztolanyi-5617938/&quot;&gt;Peter Kosztolanyi&lt;/a&gt;, 
Analytics Platform Lead at &lt;a href=&quot;https://wise.com/&quot;&gt;Wise&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases&quot;&gt;Releases&lt;/h2&gt;

&lt;p&gt;Following are some highlights of the Trino releases since episode 69:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-470.html&quot;&gt;Trino 470&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New DuckDB connector&lt;/li&gt;
  &lt;li&gt;New Grafana Loki connector&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITH SESSION&lt;/code&gt; for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SELECT&lt;/code&gt; queries&lt;/li&gt;
  &lt;li&gt;Raise minimum runtime requirement to Java 11 for JDBC driver and CLI&lt;/li&gt;
  &lt;li&gt;Remove Kinesis connector&lt;/li&gt;
  &lt;li&gt;Deprecate use of the legacy file system support for Azure Storage, Google
Cloud Storage, IBM Cloud Object Storage, S3 and S3-compatible object storage
systems - &lt;a href=&quot;/blog/2025/02/10/old-file-system.html&quot;&gt;check out the blog post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
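
&lt;p&gt;A hedged sketch of the new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITH SESSION&lt;/code&gt; support, which scopes session
properties to a single query - the property shown is just an example:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;WITH SESSION query_max_run_time = '10m'
SELECT count(*) FROM tpch.tiny.orders;
&lt;/code&gt;&lt;/pre&gt;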

&lt;p&gt;As always, numerous performance improvements, bug fixes, and other features were
added as well.&lt;/p&gt;

&lt;h2 id=&quot;introducing-peter-kosztolanyi&quot;&gt;Introducing Peter Kosztolanyi&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/koszti&quot;&gt;Peter Kosztolanyi&lt;/a&gt; is the Analytics Platform Lead at
&lt;a href=&quot;https://wise.com/&quot;&gt;Wise&lt;/a&gt; and he &lt;a href=&quot;https://youtu.be/K5RmYtbeXAc&quot;&gt;presented about their data
lake&lt;/a&gt; with Abdullah Alkhawatrah at &lt;a href=&quot;/blog/2024/12/18/trino-summit-2024-quick-recap.html&quot;&gt;Trino Summit
2024&lt;/a&gt;. Peter has a lot
of experience in the data and business intelligence fields.&lt;/p&gt;

&lt;p&gt;He also contributes to the Trino Python client, and worked on his own phone and
messaging app for iOS and Android in the past.&lt;/p&gt;

&lt;h2 id=&quot;trino-legacy-web-ui&quot;&gt;Trino legacy web UI&lt;/h2&gt;

&lt;p&gt;The &lt;a href=&quot;/docs/current/admin/web-interface.html&quot;&gt;existing main web UI for
Trino&lt;/a&gt; has been around
for a long time, and sees very limited development and maintenance. It lacks
documentation, a modern look, and a clean codebase, and is inconsistent across
screens. It is also very technical and developer-focused, and lacks features
like a SQL console for running queries.&lt;/p&gt;

&lt;h2 id=&quot;efforts-for-a-new-ui&quot;&gt;Efforts for a new UI&lt;/h2&gt;

&lt;p&gt;While we all knew about the problems of the old UI, nobody with enough UI coding
knowledge or time and motivation ever took up the banner to change the
situation. We did however get a great new UI contributed in Trino Gateway, and
that motivated some people in the community, especially Peter.&lt;/p&gt;

&lt;p&gt;Peter started with the same stack, pulled in maintainers like Mateusz Gajewski
and Manfred Moser, and kept working on improvements. We talk more about the
following aspects:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Problems with the old UI and its technology stack&lt;/li&gt;
  &lt;li&gt;Trino Gateway UI&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/issues/22697&quot;&gt;Roadmap issue&lt;/a&gt; and discussion around the new UI&lt;/li&gt;
  &lt;li&gt;What is the stack now?&lt;/li&gt;
  &lt;li&gt;Look at the
&lt;a href=&quot;https://github.com/trinodb/trino/tree/master/core/trino-web-ui&quot;&gt;codebase&lt;/a&gt;,
tools, development, and
&lt;a href=&quot;/docs/current/admin/preview-web-interface.html&quot;&gt;documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Current status and next steps&lt;/li&gt;
  &lt;li&gt;What do we need from others?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;demo-time&quot;&gt;Demo time&lt;/h2&gt;

&lt;p&gt;Peter shows us the new UI from his development setup - the latest and greatest
set of features.&lt;/p&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/docs/current/admin/preview-web-interface.html&quot;&gt;Preview Web UI documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/tree/master/core/trino-web-ui&quot;&gt;Preview Web UI codebase&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/issues/22697&quot;&gt;Roadmap issue&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/docs/current/admin/web-interface.html&quot;&gt;Legacy Web UI documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Join us for upcoming events and let us know if you want to be the next guest.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Trino contributor call, 27th of February&lt;/li&gt;
  &lt;li&gt;Trino Community Broadcast 71 with Jan Waś about the new &lt;a href=&quot;/docs/current/connector/faker.html&quot;&gt;Faker
connector&lt;/a&gt;, 27th of
February&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Host</summary>

      
      
    </entry>
  
    <entry>
      <title>69: Client protocol improvements</title>
      <link href="https://trino.io/episodes/69.html" rel="alternate" type="text/html" title="69: Client protocol improvements" />
      <published>2025-01-30T00:00:00+00:00</published>
      <updated>2025-01-30T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/69</id>
      <content type="html" xml:base="https://trino.io/episodes/69.html">&lt;h2 id=&quot;host&quot;&gt;Host&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director/Open
Source Engineering at &lt;a href=&quot;https://trino.io/users.html#starburst&quot;&gt;Starburst&lt;/a&gt; -
&lt;a href=&quot;https://x.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://www.firebolt.io/&quot;&gt;Firebolt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/wendigo&quot;&gt;Mateusz Gajewski&lt;/a&gt;, Sr. Staff Software Engineer at 
&lt;a href=&quot;https://trino.io/users.html#starburst&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases&quot;&gt;Releases&lt;/h2&gt;

&lt;p&gt;Following are some highlights of the first release of 2025. It took us a bit longer to work through release blockers this time:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-469.html&quot;&gt;Trino 469&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FIRST&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AFTER&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LAST&lt;/code&gt; clauses to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALTER TABLE ...
ADD COLUMN&lt;/code&gt; for Iceberg, MySQL, and MariaDB.&lt;/li&gt;
  &lt;li&gt;Add support for SSE-C in S3 security mapping for Delta Lake, Hive, Hudi, and Iceberg.&lt;/li&gt;
  &lt;li&gt;Allow configuration for Google Cloud Storage endpoint with object storage
connectors.&lt;/li&gt;
  &lt;li&gt;Allow connection validation and add more stats for JDBC driver.&lt;/li&gt;
  &lt;li&gt;Remove support for connector-level event listeners.&lt;/li&gt;
  &lt;li&gt;Misc improvements for the Faker connector.&lt;/li&gt;
&lt;/ul&gt;
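
&lt;p&gt;Hedged examples of the new column placement clauses for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALTER TABLE ...
ADD COLUMN&lt;/code&gt; - catalog, table, and column names are made up:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;ALTER TABLE iceberg.sales.orders ADD COLUMN order_uuid uuid FIRST;
ALTER TABLE iceberg.sales.orders ADD COLUMN discount double AFTER price;
&lt;/code&gt;&lt;/pre&gt;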

&lt;p&gt;As always, numerous performance improvements, bug fixes, and other features were
added as well.&lt;/p&gt;

&lt;h2 id=&quot;other-news&quot;&gt;Other news&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Trino Python client 0.332.0 with spooling support&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-23-jan-2025&quot;&gt;Trino contributor call&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introducing-wendigo&quot;&gt;Introducing wendigo&lt;/h2&gt;

&lt;p&gt;What can we say? Top contributor and maintainer, and all around hacker on Trino,
numerous Trino subprojects, Airlift, and beyond.&lt;/p&gt;

&lt;h2 id=&quot;main-topic&quot;&gt;Main topic&lt;/h2&gt;

&lt;p&gt;Let’s talk about the Trino client protocol. Following are some topics we cover:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;What is the client protocol for?&lt;/li&gt;
  &lt;li&gt;History of the client protocol&lt;/li&gt;
  &lt;li&gt;Available client drivers and client applications&lt;/li&gt;
  &lt;li&gt;Architecture and flow&lt;/li&gt;
  &lt;li&gt;Motivation to improve the protocol&lt;/li&gt;
  &lt;li&gt;Direct and spooling modes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Mateusz walks through the presentation and Cole and Manfred ask a lot of
questions:&lt;/p&gt;

&lt;div class=&quot;card-deck spacer-30&quot;&gt;
    &lt;a class=&quot;btn btn-pink&quot; target=&quot;_blank&quot; href=&quot;/assets/episode/tcb69-client-protocol.pdf&quot;&gt;
        Presentation
    &lt;/a&gt;
&lt;/div&gt;

&lt;h2 id=&quot;demo-time&quot;&gt;Demo time&lt;/h2&gt;

&lt;p&gt;Mateusz shows us his example and testing setup with Starburst Galaxy clusters
configured for spooling protocol use and shares some of the performance gains he
observes.&lt;/p&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/assets/episode/tcb69-client-protocol.pdf&quot;&gt;Presentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/docs/current/client/client-protocol.html&quot;&gt;Client protocol documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/ecosystem/client-driver.html&quot;&gt;Available client drivers&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/ecosystem/client-application.html&quot;&gt;Available client applications&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Join us for upcoming events and let us know if you want to be the next guest.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Host</summary>

      
      
    </entry>
  
    <entry>
      <title>68: Year of the Snake - Python UDFs</title>
      <link href="https://trino.io/episodes/68.html" rel="alternate" type="text/html" title="68: Year of the Snake - Python UDFs" />
      <published>2025-01-16T00:00:00+00:00</published>
      <updated>2025-01-16T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/68</id>
      <content type="html" xml:base="https://trino.io/episodes/68.html">&lt;h2 id=&quot;host&quot;&gt;Host&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director/Open
Source Engineering and Trino maintainer at
&lt;a href=&quot;https://trino.io/users.html#starburst&quot;&gt;Starburst&lt;/a&gt; -
&lt;a href=&quot;https://x.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://www.firebolt.io/&quot;&gt;Firebolt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/electrum&quot;&gt;David Phillips&lt;/a&gt;, Trino co-creator and maintainer&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases&quot;&gt;Releases&lt;/h2&gt;

&lt;p&gt;Following are some highlights of the Trino releases since episode 67:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-465.html&quot;&gt;Trino 465&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for a customer-provided SSE key in the S3 file system, relevant
for the Hive, Iceberg, Delta Lake, and Hudi connectors.&lt;/li&gt;
  &lt;li&gt;Deterministic data, locale support, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;random_string&lt;/code&gt; function for the Faker
connector.&lt;/li&gt;
  &lt;li&gt;Add support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;extra_properties&lt;/code&gt; in the Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Add support for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;geometry&lt;/code&gt; type in the PostgreSQL connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-466.html&quot;&gt;Trino 466&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Remove Python requirement for Trino by replacing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;launcher&lt;/code&gt; script.&lt;/li&gt;
  &lt;li&gt;Improve client protocol throughput by introducing the spooling protocol and
ship it with documentation, including implementation in the JDBC driver and
the CLI.&lt;/li&gt;
  &lt;li&gt;Add support for data access control with Apache Ranger, including support for
column masking, row filtering, and audit logging.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-467.html&quot;&gt;Trino 467&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Change default for internal communication to HTTP/1.1.&lt;/li&gt;
  &lt;li&gt;Add support for OpenTelemetry tracing to the HTTP, Kafka, and MySQL event
listeners.&lt;/li&gt;
  &lt;li&gt;Remove the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;microdnf&lt;/code&gt; package manager from the Docker image.&lt;/li&gt;
  &lt;li&gt;Add the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$all_manifests&lt;/code&gt; metadata tables in the Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Add the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$transactions&lt;/code&gt; metadata table in the Delta Lake connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-468.html&quot;&gt;Trino 468&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add &lt;a href=&quot;/docs/current/udf/python.html&quot;&gt;Python user-defined functions&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Rename SQL routines to SQL user-defined functions.&lt;/li&gt;
  &lt;li&gt;Add cluster overview to the Preview Web UI.&lt;/li&gt;
  &lt;li&gt;Improve bucket execution for Hive and Iceberg.&lt;/li&gt;
  &lt;li&gt;Add support for non-transactional &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MERGE&lt;/code&gt; statements for PostgreSQL.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As always, numerous performance improvements, bug fixes, and other features were
added as well.&lt;/p&gt;

&lt;h2 id=&quot;other-news&quot;&gt;Other news&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trinodb.github.io/trino-gateway/release-notes/#13&quot;&gt;Trino Gateway 13&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/12/18/trino-summit-2024-quick-recap.html&quot;&gt;Trino Summit recap&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2025/01/07/2024-and-beyond.html&quot;&gt;Trino in 2024 and beyond&lt;/a&gt;, answer
our survey!&lt;/li&gt;
  &lt;li&gt;December 2024 Trino maintainer and contributor calls took place virtually.&lt;/li&gt;
  &lt;li&gt;Trino Python client 0.332.0 includes support for the spooling mode of the
client protocol.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;user-defined-functions-in-trino&quot;&gt;User-defined functions in Trino&lt;/h2&gt;

&lt;p&gt;First there were &lt;a href=&quot;/docs/current/develop/functions.html&quot;&gt;custom plugins with user-defined
functions&lt;/a&gt;, and for a long
time, that was all there was.&lt;/p&gt;

&lt;p&gt;In 2023, David contributed SQL user-defined functions, also known as SQL
routines, and we ran a &lt;a href=&quot;/blog/2023/11/09/routines.html&quot;&gt;competition for examples&lt;/a&gt;. Manfred wrote the docs and did a &lt;a href=&quot;/blog/2023/11/29/sql-training-4.html&quot;&gt;training session with
Dain and Martin&lt;/a&gt;. And even back then,
David had plans to add other languages, and started working on Python.&lt;/p&gt;

&lt;p&gt;At &lt;a href=&quot;/blog/2024/12/18/trino-summit-2024-quick-recap.html&quot;&gt;Trino Summit in 2024&lt;/a&gt; Martin Traverso announced the new upcoming feature in the keynote, and with
&lt;a href=&quot;/docs/current/release/release-468.html&quot;&gt;Trino 468&lt;/a&gt; we shipped
support for &lt;a href=&quot;/docs/current/udf/python.html&quot;&gt;Python user-defined functions&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;motivation&quot;&gt;Motivation&lt;/h2&gt;

&lt;p&gt;Why support Python for user-defined functions, as compared to just SQL? Simply
put, more is better, and Python is everywhere. We chat with David about the
details.&lt;/p&gt;

&lt;h2 id=&quot;development-history-and-collaboration&quot;&gt;Development history and collaboration&lt;/h2&gt;

&lt;p&gt;David tells us more about figuring out how to make it all work. He touches
on topics such as security, performance, deployment, monitoring, and
collaboration with other projects. We also talk about why other approaches,
such as using a local CPython, were discarded.&lt;/p&gt;

&lt;h2 id=&quot;architecture-and-consequences&quot;&gt;Architecture and consequences&lt;/h2&gt;

&lt;p&gt;In this discussion we try to cover the following topics:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;How does it all work?&lt;/li&gt;
  &lt;li&gt;What are some restrictions?&lt;/li&gt;
  &lt;li&gt;What performance can users expect?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s chat about this nesting:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/episode/tcb68-python-udf-architecture.png&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;examples-and-demo&quot;&gt;Examples and demo&lt;/h2&gt;

&lt;p&gt;A simple example from the documentation:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;FUNCTION&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;python_udf_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;input_parameter&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;data_type&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;RETURNS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result_data_type&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;LANGUAGE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PYTHON&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;handler&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;python_function&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;err&quot;&gt;$$&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;python_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;...&lt;/span&gt;
  &lt;span class=&quot;err&quot;&gt;$$&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;David shows us more, and we talk about the details.&lt;/p&gt;
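
&lt;p&gt;As a concrete illustration of the template above (a hypothetical sketch
following the documented syntax, not a function from the episode), a UDF that
doubles an integer could look like this:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;FUNCTION doubleup(x integer)
  RETURNS integer
  LANGUAGE PYTHON
  WITH (handler = &apos;twice&apos;)
  AS $$
def twice(a):
    return a * 2
$$
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;handler&lt;/code&gt; property names the Python function inside the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$$&lt;/code&gt;-quoted body that Trino invokes for each row.&lt;/p&gt;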

&lt;h2 id=&quot;feedback-and-future-work&quot;&gt;Feedback and future work&lt;/h2&gt;

&lt;p&gt;We are looking for feedback:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;More examples for the documentation for our users&lt;/li&gt;
  &lt;li&gt;Use cases and experience testing the feature&lt;/li&gt;
  &lt;li&gt;Production deployment experiences&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Future work depends on the feedback but definitely includes the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Performance improvements&lt;/li&gt;
  &lt;li&gt;Fine-tuning of available Python packages&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.python.org/&quot;&gt;Python&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://webassembly.org/&quot;&gt;WebAssembly (Wasm)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://chicory.dev/&quot;&gt;Chicory&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/docs/current/udf.html&quot;&gt;Trino user-defined functions overview&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/docs/current/udf/python.html&quot;&gt;Python user-defined functions&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino-wasm-python&quot;&gt;trino-wasm-python&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;You are all invited to chat with us about development at the Trino contributor
call on the 23rd of January.&lt;/li&gt;
  &lt;li&gt;Join us on the 30th of January with Mateusz Gajewski to learn about client
protocol improvements.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Host</summary>

      
      
    </entry>
  
    <entry>
      <title>67: Extra speed with Exasol and Trino</title>
      <link href="https://trino.io/episodes/67.html" rel="alternate" type="text/html" title="67: Extra speed with Exasol and Trino" />
      <published>2024-10-30T00:00:00+00:00</published>
      <updated>2024-10-30T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/67</id>
      <content type="html" xml:base="https://trino.io/episodes/67.html">&lt;h2 id=&quot;host&quot;&gt;Host&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of Trino
Community Leadership at &lt;a href=&quot;https://trino.io/users.html#starburst&quot;&gt;Starburst&lt;/a&gt; - 
&lt;a href=&quot;https://x.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://www.firebolt.io/&quot;&gt;Firebolt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/thomas-bestfleisch/&quot;&gt;Thomas Bestfleisch&lt;/a&gt;, 
Senior Product Manager at &lt;a href=&quot;https://www.exasol.com/&quot;&gt;Exasol&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-and-news&quot;&gt;Releases and news&lt;/h2&gt;

&lt;p&gt;Following are some highlights of the recent Trino releases:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-461.html&quot;&gt;Trino 461&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add_files&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add_files_from_table&lt;/code&gt; procedures in the
Iceberg connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-462.html&quot;&gt;Trino 462&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for read operations when using the Unity catalog as Iceberg REST
catalog in the Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Improve performance and memory usage when decoding data in the CLI.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-463.html&quot;&gt;Trino 463&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Enable HTTP/2 for internal communication by default.&lt;/li&gt;
  &lt;li&gt;Add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;timezone()&lt;/code&gt; functions.&lt;/li&gt;
  &lt;li&gt;Include table functions with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SHOW FUNCTIONS&lt;/code&gt; output.&lt;/li&gt;
  &lt;li&gt;Add support for writing change data feed when deletion vector is enabled to
the Delta Lake connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-464.html&quot;&gt;Trino 464&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Require JDK 23 to run Trino.&lt;/li&gt;
  &lt;li&gt;Add the Faker connector.&lt;/li&gt;
  &lt;li&gt;Add the Vertica connector.&lt;/li&gt;
  &lt;li&gt;Remove the Accumulo connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As always, numerous performance improvements, bug fixes, and other features were
added as well.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Trino maintainer call - great sync with some exciting news coming to the community soon.&lt;/li&gt;
  &lt;li&gt;Trino contributor call - &lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-24-oct-2024&quot;&gt;recording and minutes available now&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Trino Kubernetes operator meeting - minutes coming soon.&lt;/li&gt;
  &lt;li&gt;Trino Summit call for speakers closed - stay tuned for announcements and
&lt;a href=&quot;/blog/2024/10/17/trino-summit-2024-tease.html&quot;&gt;don’t forget to register&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introducing-thomas-and-exasol&quot;&gt;Introducing Thomas and Exasol&lt;/h2&gt;

&lt;p&gt;Exasol is a lightning-fast, in-memory database for analytics. And this is not
just a marketing slogan: Exasol has been at the top of the TPC-H benchmarks for
a long time now. Thomas tells us more about the database and his role.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/logos/exasol-small.png&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;exasol-and-trino&quot;&gt;Exasol and Trino&lt;/h2&gt;

&lt;p&gt;Together, Trino and Exasol bridge the gap between the extreme performance of
Exasol&#8217;s in-memory engine and the massive scale of a Trino lakehouse.&lt;/p&gt;

&lt;p&gt;We learn more about Exasol as Thomas guides us through his &lt;a href=&quot;/assets/episode/tcb67-exasol.pdf&quot;&gt;presentation about
Exasol and Trino&lt;/a&gt;, and take
the opportunity to question him for more details.&lt;/p&gt;

&lt;p&gt;The pull request for the Exasol connector has been a long time in the works and
was finally merged for Trino 452. We talk about the motivation, the process,
the results, and the future for the connector.&lt;/p&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.exasol.com/&quot;&gt;Exasol&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/connector/exasol.html&quot;&gt;Trino’s Exasol connector&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.exasol.com/exasol-saas/&quot;&gt;Exasol SaaS&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/exasol/ai-lab&quot;&gt;Exasol AI lab&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://hub.docker.com/r/exasol/docker-db&quot;&gt;Exasol container&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/10/07/sql-basecamps.html&quot;&gt;SQL basecamps before Trino Summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/10/17/trino-summit-2024-tease.html&quot;&gt;Trino Summit 2024&lt;/a&gt;:
Information about first sessions and more available. Call for speakers closed.
Announcements coming soon.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Host</summary>

      
      
    </entry>
  
    <entry>
      <title>66: Chat with Trino and Wren AI</title>
      <link href="https://trino.io/episodes/66.html" rel="alternate" type="text/html" title="66: Chat with Trino and Wren AI" />
      <published>2024-09-12T00:00:00+00:00</published>
      <updated>2024-09-12T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/66</id>
      <content type="html" xml:base="https://trino.io/episodes/66.html">&lt;h2 id=&quot;host&quot;&gt;Host&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of Trino
Community Leadership at &lt;a href=&quot;https://trino.io/users.html#starburst&quot;&gt;Starburst&lt;/a&gt;,
(&lt;a href=&quot;https://x.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/himanshu-mendapara-a732051aa/&quot;&gt;Himanshu Mendapara&lt;/a&gt;, 
Software Engineer at &lt;a href=&quot;https://begenuin.com/&quot;&gt;Genuin&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/wwwy3y3/&quot;&gt;William Chang&lt;/a&gt;, 
CTO and Co-Founder at &lt;a href=&quot;https://cannerdata.com/&quot;&gt;Canner&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/yadiacolindres/&quot;&gt;Yadia Colindres&lt;/a&gt;, 
Product Management Advisor at &lt;a href=&quot;https://cannerdata.com/&quot;&gt;Canner&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-and-news&quot;&gt;Releases and news&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-458.html&quot;&gt;Trino 458&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Deactivate legacy file system support for all catalogs. You must activate the
desired file system support with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fs.native-azure.enabled&lt;/code&gt;,
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fs.native-gcs.enabled&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fs.native-s3.enabled&lt;/code&gt;, or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fs.hadoop.enabled&lt;/code&gt; in
each catalog using the Delta Lake, Hive, Hudi, or Iceberg connectors.&lt;/li&gt;
  &lt;li&gt;Add support for tracing with OpenTelemetry to the JDBC driver.&lt;/li&gt;
  &lt;li&gt;Reduce data transfer from remote systems for queries with large &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IN&lt;/code&gt; lists in
numerous connectors.&lt;/li&gt;
&lt;/ul&gt;
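
&lt;p&gt;For example, an Iceberg catalog using the native S3 file system might be
configured like this (a hypothetical sketch; the exact properties depend on
your deployment):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;connector.name=iceberg
fs.native-s3.enabled=true
s3.region=us-east-1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;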

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-459.html&quot;&gt;Trino 459&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The Docker container now uses Java 23. Please test this and let us know of
any problems, since Java 23 is going to be a requirement soon.&lt;/li&gt;
  &lt;li&gt;Add support for KiB and similar data size units for the Trino CLI output.&lt;/li&gt;
  &lt;li&gt;Allow configuring the maximum concurrent HTTP requests to Azure on every node.&lt;/li&gt;
  &lt;li&gt;Add support for WASB to Azure Storage file system support.&lt;/li&gt;
  &lt;li&gt;Improve cache hit ratio for the file system cache.&lt;/li&gt;
  &lt;li&gt;Remove the local file connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-460.html&quot;&gt;Trino 460&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for using an Alluxio cluster as file system cache.&lt;/li&gt;
  &lt;li&gt;Add support for WASBS to Azure Storage file system support.&lt;/li&gt;
  &lt;li&gt;Remove the atop connector.&lt;/li&gt;
  &lt;li&gt;Remove the Raptor connector.&lt;/li&gt;
  &lt;li&gt;Numerous performance improvements for the ClickHouse connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As usual, numerous performance improvements, bug fixes, and other features
have been added as well.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Updated and improved documentation for contributors for Trino, Trino Gateway,
and other Trino projects.&lt;/li&gt;
  &lt;li&gt;Jan Was steps up as subproject maintainer for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trino-js-client&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Cristian Osiac, Jordan Zimmermann, and Pablo Arteaga are working on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aws-proxy&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introducing-himanshu&quot;&gt;Introducing Himanshu&lt;/h2&gt;

&lt;p&gt;Himanshu works at Genuin as a software engineer, learning about new
technologies, and occasionally &lt;a href=&quot;https://github.com/himanshu634&quot;&gt;contributing to open source
projects&lt;/a&gt; like Wren AI.&lt;/p&gt;

&lt;h2 id=&quot;introducing-william-and-yadia&quot;&gt;Introducing William and Yadia&lt;/h2&gt;

&lt;p&gt;William is co-founder at Canner and drives everything about Canner Enterprise
and Wren AI as CTO. Yadia works with William at Canner and is product manager
for Wren AI.&lt;/p&gt;

&lt;p&gt;We talk about the history of Canner and their usage of Trino in Canner
Enterprise.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/logos/canner-small.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Pivoting to talk about Wren AI, we learn about its architecture, use cases and
features, and continue along with an extensive demo of Wren AI.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/logos/wren-ai-small.png&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://cannerdata.com/&quot;&gt;Canner&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.getwren.ai/&quot;&gt;Wren AI&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/Canner/WrenAI/pull/535&quot;&gt;Pull request for Trino integration&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.getwren.ai/oss/guide/connect/trino&quot;&gt;Trino as Wren AI data source documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.producthunt.com/posts/wren-ai-cloud&quot;&gt;Wren AI launch at producthunt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;A call out to help us &lt;a href=&quot;https://github.com/trinodb/trino/issues/23121&quot;&gt;clean up and close old
issues&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/07/11/trino-summit-2024-call-for-speakers.html&quot;&gt;Trino Summit 2024&lt;/a&gt;
is coming on the 11th and 12th of December, and registration, call for
speakers, and sponsorship opportunities are open.&lt;/li&gt;
  &lt;li&gt;Join us for the next &lt;a href=&quot;https://trino.io/broadcast/index.html&quot;&gt;Trino Community Broadcast
67&lt;/a&gt; about the Exasol database and Trino connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Host</summary>

      
      
    </entry>
  
    <entry>
      <title>65: Performance boosts</title>
      <link href="https://trino.io/episodes/65.html" rel="alternate" type="text/html" title="65: Performance boosts" />
      <published>2024-09-12T00:00:00+00:00</published>
      <updated>2024-09-12T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/65</id>
      <content type="html" xml:base="https://trino.io/episodes/65.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of Trino
Community Leadership at &lt;a href=&quot;https://trino.io/users.html#starburst&quot;&gt;Starburst&lt;/a&gt;,
(&lt;a href=&quot;https://x.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://trino.io/users.html#starburst&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-and-news&quot;&gt;Releases and news&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-455.html&quot;&gt;Trino 455&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add query starting time in QueryStatistics in all event listeners, including
the new Kafka event listener.&lt;/li&gt;
  &lt;li&gt;Allow configuring endpoint for the native Azure filesystem.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-456.html&quot;&gt;Trino 456&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Invalid - release process errors resulted in invalid artifacts.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-457.html&quot;&gt;Trino 457&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improve performance of queries involving joins when fault-tolerant execution
is enabled.&lt;/li&gt;
  &lt;li&gt;Improve performance for LZ4, Snappy, and ZSTD compression and decompression.&lt;/li&gt;
  &lt;li&gt;Publish a JDBC driver JAR without bundled, third-party dependencies.&lt;/li&gt;
  &lt;li&gt;Improve performance for concurrent write operations on S3 by using lock-less
Delta Lake write reconciliation, made possible with the release of the AWS SDK
with S3 conditional write support.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As usual, numerous performance improvements, bug fixes, and other features
have been added as well.&lt;/p&gt;

&lt;h2 id=&quot;performance-boosters&quot;&gt;Performance boosters&lt;/h2&gt;

&lt;p&gt;We chat about some of the following aspects and projects and their impact on Trino:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Role and history of Aircompressor.&lt;/li&gt;
  &lt;li&gt;Foundation from Airlift.&lt;/li&gt;
  &lt;li&gt;Relation to Java 22, and soon 23.&lt;/li&gt;
  &lt;li&gt;Status and next steps for improved and modernized file system support.&lt;/li&gt;
  &lt;li&gt;A quick glance at client protocol improvements.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/airlift/aircompressor&quot;&gt;Aircompressor&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/airlift/airlift&quot;&gt;Airlift&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/object-storage.html&quot;&gt;Object storage and file system documentation&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/issues/14237&quot;&gt;Project Hummingbird&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/issues/22271&quot;&gt;Project Swift&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;We chat about the &lt;a href=&quot;https://github.com/trinodb/trino/issues/23122&quot;&gt;recent cleanup of unused Slack
channels&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;A call out to help us &lt;a href=&quot;https://github.com/trinodb/trino/issues/23121&quot;&gt;clean up and close old
issues&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Check out our new &lt;a href=&quot;https://github.com/trinodb/presentations/tree/main/assets/backgrounds&quot;&gt;video call background
images&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/07/11/trino-summit-2024-call-for-speakers.html&quot;&gt;Trino Summit 2024&lt;/a&gt;
is coming on the 11th and 12th of December, and registration, call for
speakers, and sponsorship opportunities are open.&lt;/li&gt;
  &lt;li&gt;Join us for the next &lt;a href=&quot;https://trino.io/broadcast/index.html&quot;&gt;Trino Community Broadcast
66&lt;/a&gt; about Wren AI and Trino.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>64: Control with Open Policy Agent OPA</title>
      <link href="https://trino.io/episodes/64.html" rel="alternate" type="text/html" title="64: Control with Open Policy Agent OPA" />
      <published>2024-08-22T00:00:00+00:00</published>
      <updated>2024-08-22T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/64</id>
      <content type="html" xml:base="https://trino.io/episodes/64.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of Trino
Community Leadership at &lt;a href=&quot;https://trino.io/users.html#starburst&quot;&gt;Starburst&lt;/a&gt;,
(&lt;a href=&quot;https://x.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://trino.io/users.html#starburst&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/sebastian-bernauer-622b95167&quot;&gt;Sebastian Bernauer&lt;/a&gt;, Software Developer at &lt;a href=&quot;https://trino.io/users.html#stackable&quot;&gt;Stackable&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/soenkeliebau/&quot;&gt;Sönke Liebau&lt;/a&gt;, Co-Founder and CPO
at &lt;a href=&quot;https://trino.io/users.html#stackable&quot;&gt;Stackable&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-and-news&quot;&gt;Releases and news&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-454.html&quot;&gt;Trino 454&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improve performance for queries that contain multiple aggregate functions,
including &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DISTINCT&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Add Kafka event listener plugin (yet to be documented).&lt;/li&gt;
  &lt;li&gt;Add configuration for fetch size with JDBC-based connectors (yet to be documented).&lt;/li&gt;
  &lt;li&gt;Add support for writing Deletion Vectors with the Delta Lake connector.&lt;/li&gt;
  &lt;li&gt;Add new &lt;strong&gt;Resources&lt;/strong&gt; tab in the web interface with data from the new
lightweight query endpoint &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/v1/query?pruned=true&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Add new Preview Web UI (help us test and develop!).&lt;/li&gt;
  &lt;li&gt;Add S3 security mapping for the native S3 filesystem.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As usual, numerous performance improvements, bug fixes, and other features
have been added as well.&lt;/p&gt;

&lt;h2 id=&quot;stackable-opa-and-more&quot;&gt;Stackable, OPA, and more&lt;/h2&gt;

&lt;p&gt;We chat with Sönke and Sebastian about the following agenda topics:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;What is Stackable?&lt;/li&gt;
  &lt;li&gt;Open Policy Agent (OPA) authorization plugin
    &lt;ul&gt;
      &lt;li&gt;History&lt;/li&gt;
      &lt;li&gt;Recent development&lt;/li&gt;
      &lt;li&gt;Compatibility layer to Trino’s file-based access control&lt;/li&gt;
      &lt;li&gt;Quick demo on row filtering and column masking&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Auto-scaling Trino clusters using trino-lb
    &lt;ul&gt;
      &lt;li&gt;Differences between &lt;a href=&quot;https://trino.io/ecosystem/add-on.html#trino-gateway&quot;&gt;Trino
Gateway&lt;/a&gt; and
&lt;a href=&quot;https://trino.io/ecosystem/add-on.html#trino-lb&quot;&gt;trino-lb&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Other aspects we discuss include the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Performance considerations&lt;/li&gt;
  &lt;li&gt;Aspects of Trino on Kubernetes such as graceful shutdown,
PodDisruptionBudgets, and anti-affinity&lt;/li&gt;
  &lt;li&gt;Plans for next steps&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;other-resources&quot;&gt;Other resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/assets/episode/tcb64-stackable-opa-trino-lb.pdf&quot;&gt;Presentation slide deck&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;i class=&quot;fab fa-youtube watch-listen-icon&quot; title=&quot;Youtube&quot;&gt;&lt;/i&gt; Video for
&lt;a href=&quot;https://www.youtube.com/watch?v=fbqqapQbAv0&quot;&gt;Trino OPA Authorizer - Stackable and Bloomberg at Trino Summit
2023&lt;/a&gt; presented by Sönke from
Stackable and Pablo Arteaga from Bloomberg&lt;/li&gt;
  &lt;li&gt;&lt;i class=&quot;fab fa-github&quot; title=&quot;GitHub&quot;&gt;&lt;/i&gt; &lt;a href=&quot;https://github.com/stackabletech/trino-operator/tree/main/tests/templates/kuttl/opa-authorization/trino_rules&quot;&gt;Source code repo for
compatibility layer between Trino classic file-based access control JSON and
OPA/Trino&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;i class=&quot;fab fa-youtube watch-listen-icon&quot; title=&quot;Youtube&quot;&gt;&lt;/i&gt; &lt;a href=&quot;https://www.youtube.com/watch?v=ATlq_l3WNiA&quot;&gt;Longer demo
video for row filtering and column
masking&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/07/11/trino-summit-2024-call-for-speakers.html&quot;&gt;Trino Summit 2024&lt;/a&gt;
is coming on the 11th and 12th of December, and registration, call for
speakers, and sponsorship opportunities are open.&lt;/li&gt;
  &lt;li&gt;Next &lt;a href=&quot;https://trino.io/broadcast/index.html&quot;&gt;Trino Community Broadcast 65&lt;/a&gt; about
the new Exasol connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>63: Querying with JS</title>
      <link href="https://trino.io/episodes/63.html" rel="alternate" type="text/html" title="63: Querying with JS" />
      <published>2024-08-01T00:00:00+00:00</published>
      <updated>2024-08-01T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/63</id>
      <content type="html" xml:base="https://trino.io/episodes/63.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of Trino
Community Leadership at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;,
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest&quot;&gt;Guest&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.linkedin.com/in/emilyasunaryo&quot;&gt;Emily Sunaryo&lt;/a&gt;, DevRel Intern at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;releases-and-news&quot;&gt;Releases and news&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-452.html&quot;&gt;Trino 452&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add Exasol connector.&lt;/li&gt;
  &lt;li&gt;Add support for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;euclidean_distance()&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dot_product()&lt;/code&gt;, and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cosine_distance()&lt;/code&gt; functions.&lt;/li&gt;
  &lt;li&gt;Add support for using the BigQuery Storage Read API when using the query table
function with the BigQuery connector.&lt;/li&gt;
  &lt;li&gt;Add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query&lt;/code&gt; table function for full query pass-through to the ClickHouse
connector.&lt;/li&gt;
  &lt;li&gt;Numerous improvements to the Delta Lake, Hive, Hudi, and Iceberg connectors
and the related file system support in Trino.&lt;/li&gt;
&lt;/ul&gt;
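&lt;p&gt;As a quick illustration of what the new vector functions compute, here is a
small Python sketch assuming the standard mathematical definitions (this is an
assumption for intuition, not taken from the Trino implementation):&lt;/p&gt;

```python
# Sketch of the standard definitions the new SQL vector functions presumably
# follow (assumption for illustration, not the Trino source).
import math

def dot_product(a, b):
    # Sum of element-wise products.
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    # Straight-line distance between two vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # 1 minus the cosine similarity of the two vectors.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return 1.0 - dot_product(a, b) / (norm_a * norm_b)
```

&lt;p&gt;Identical directions give a cosine distance of 0 and orthogonal vectors give 1,
which is useful intuition when ranking embeddings in SQL.&lt;/p&gt;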

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-453.html&quot;&gt;Trino 453&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improved performance for non-equality joins.&lt;/li&gt;
  &lt;li&gt;Support for setting the SQL path for JDBC driver and CLI.&lt;/li&gt;
  &lt;li&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;execute&lt;/code&gt; procedure to run arbitrary statements in the underlying data source.&lt;/li&gt;
  &lt;li&gt;Support for reading &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pgvector&lt;/code&gt; vector types in PostgreSQL connector.&lt;/li&gt;
  &lt;li&gt;Support for views when using the Iceberg JDBC catalog.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As usual, numerous performance improvements, bug fixes, and other features
have been added as well.&lt;/p&gt;

&lt;p&gt;Other noteworthy topics:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The &lt;a href=&quot;https://trinodb.github.io/trino-gateway/release-notes/&quot;&gt;Trino Gateway 10&lt;/a&gt;
release is out, and includes some major refactoring and new features.&lt;/li&gt;
  &lt;li&gt;The &lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-25-jul-2024&quot;&gt;Trino Contributor Call&lt;/a&gt;
recap is available. Note that the file system support will soon switch to the
new Trino-native implementations by default.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest-emily-sunaryo&quot;&gt;Guest Emily Sunaryo&lt;/h2&gt;

&lt;p&gt;Emily Sunaryo is a recent UC Berkeley graduate working in the Developer
Relations team at Starburst. She has a passion for both technical development
and the enablement of developer communities. With her degree in Data Science,
she is also interested in learning more about modern approaches to data
analytics and how emerging technologies can drive innovation in this space.&lt;/p&gt;

&lt;h2 id=&quot;trino-clients&quot;&gt;Trino clients&lt;/h2&gt;

&lt;p&gt;Trino clients come in many shapes and forms, but all of them allow users to run
SQL queries in Trino and access the results. They all use the Trino client REST
API. To make it easier for developers of these applications, as well as any
custom application, we provide a number of drivers as language-specific
wrappers. These include the JDBC driver, the Python client, the Go client, and
others.&lt;/p&gt;
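&lt;p&gt;All of these clients follow the same basic pattern against the client REST API:
submit the statement, then follow the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nextUri&lt;/code&gt;
links until the query finishes. A minimal sketch of that loop in Python, with
injected HTTP callables so the protocol shape stays visible (a real client such
as trino-js-client also sends headers and handles authentication and errors):&lt;/p&gt;

```python
# Minimal sketch of the Trino client REST protocol loop. `http_post` and
# `http_get` are illustrative callables that return parsed JSON dicts; a real
# client also sends headers, handles auth, and inspects the query state.
def run_query(http_post, http_get, sql):
    rows = []
    response = http_post("/v1/statement", sql)  # submit the statement
    while True:
        rows.extend(response.get("data", []))   # collect any result rows
        next_uri = response.get("nextUri")
        if next_uri is None:                    # no nextUri: the query is done
            return rows
        response = http_get(next_uri)           # fetch the next result page
```

&lt;p&gt;Each language-specific driver is essentially a typed, ergonomic wrapper around
this polling loop.&lt;/p&gt;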

&lt;h2 id=&quot;javascript&quot;&gt;JavaScript&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;https://trino.io/assets/images/logos/javascript.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/regadas&quot;&gt;Filipe Regadas&lt;/a&gt; agreed to transfer his
&lt;a href=&quot;https://github.com/trinodb/trino-js-client&quot;&gt;trino-js-client&lt;/a&gt; project to
trinodb and is now the subproject maintainer. We are in the process of getting a
first release ready to ship. We would love for you to help us!&lt;/p&gt;

&lt;h2 id=&quot;learning-about-trino&quot;&gt;Learning about Trino&lt;/h2&gt;

&lt;p&gt;Emily’s journey brings it all together: from university and a Starburst
internship to the Trino Community Broadcast and a working demo web application.&lt;/p&gt;

&lt;h2 id=&quot;demo-time&quot;&gt;Demo time&lt;/h2&gt;

&lt;p&gt;Emily talks about her demo web application using React, npm, and various other
libraries and tools to build a data application. The data resides in Trino,
specifically in &lt;a href=&quot;https://www.starburst.io/platform/starburst-galaxy/&quot;&gt;Starburst
Galaxy&lt;/a&gt; to make the
management easier, and she uses the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trino-js-client&lt;/code&gt; in her application to run
some pretty complex SQL queries against the NYC rideshare data set.&lt;/p&gt;

&lt;p&gt;Find more details in the
&lt;a href=&quot;https://github.com/emilysunaryo/trino-js-demo&quot;&gt;source code repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/07/11/trino-summit-2024-call-for-speakers.html&quot;&gt;Trino Summit 2024&lt;/a&gt;
is coming on the 11th and 12th of December, and registration, call for
speakers, and sponsorship opportunities are open.&lt;/li&gt;
  &lt;li&gt;Next &lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-22-aug-2024&quot;&gt;Trino Contributor Call&lt;/a&gt;
on the 22nd of August.&lt;/li&gt;
  &lt;li&gt;Next &lt;a href=&quot;https://trino.io/broadcast/index.html&quot;&gt;Trino Community Broadcast 64&lt;/a&gt; with
the &lt;a href=&quot;https://trino.io/users.html#stackable&quot;&gt;Stackable&lt;/a&gt; team about OPA on the 22nd
of August.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>62: A lakehouse that simply works at Prezi</title>
      <link href="https://trino.io/episodes/62.html" rel="alternate" type="text/html" title="62: A lakehouse that simply works at Prezi" />
      <published>2024-07-11T00:00:00+00:00</published>
      <updated>2024-07-11T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/62</id>
      <content type="html" xml:base="https://trino.io/episodes/62.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of Trino
Community Leadership at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;,
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest&quot;&gt;Guest&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.linkedin.com/in/vincenzo-cassaro/&quot;&gt;Vincenzo Cassaro&lt;/a&gt; -
&lt;a href=&quot;https://twitter.com/viciocassaro&quot;&gt;@viciocassaro&lt;/a&gt;, Data Engineer at
&lt;a href=&quot;https://prezi.com/&quot;&gt;Prezi&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;releases-and-news&quot;&gt;Releases and news&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-451.html&quot;&gt;Trino 451&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for configuring a proxy for the S3 native file system.&lt;/li&gt;
  &lt;li&gt;Add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t_pdf&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t_cdf&lt;/code&gt; functions.&lt;/li&gt;
  &lt;li&gt;Improve performance of certain queries involving window functions.&lt;/li&gt;
  &lt;li&gt;Lots of Iceberg connector improvements including support for incremental
refresh for basic materialized views.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Other noteworthy topics:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/oneonestar&quot;&gt;Star Poon (oneonestar)&lt;/a&gt; approved as a new
subproject maintainer for &lt;a href=&quot;https://trinodb.github.io/trino-gateway/&quot;&gt;Trino Gateway&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/06/24/trino-fest-recap.html&quot;&gt;Recap blog post&lt;/a&gt; from Trino Fest
with video recordings and slides is now available.&lt;/li&gt;
  &lt;li&gt;Trino Contributor Congregation &lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-congregation-14-june-2024&quot;&gt;recap notes&lt;/a&gt; are also available.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techplay.jp/event/944074&quot;&gt;Trino Japan meetup&lt;/a&gt; happened on the 10th of July.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest-vincenzo-cassaro&quot;&gt;Guest Vincenzo Cassaro&lt;/h2&gt;

&lt;p&gt;Vincenzo has been working with data in all its forms, from data modeling to
analytics and ML, since he completed his masters in computer engineering in
Italy. He is joining us from there, more specifically from Sicily, to chat with
us about how he got into computers, learned about Trino, and ended up at Prezi
now.&lt;/p&gt;

&lt;h2 id=&quot;about-prezi&quot;&gt;About Prezi&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://prezi.com/&quot;&gt;Prezi&lt;/a&gt; probably doesn’t need any introduction, but just in
case: Prezi is a popular and powerful platform to create and show engaging
presentations, videos, and infographics.&lt;/p&gt;

&lt;h2 id=&quot;a-lakehouse-that-simply-works&quot;&gt;A Lakehouse that simply works&lt;/h2&gt;

&lt;p&gt;With so many different technologies and vendors making proposals, it’s easy to
lose track of what truly matters. We chat with Vincenzo Cassaro from Prezi about
how a simple combination of established, maintained, open source technologies
can make a lakehouse that truly works at the scale of a company with 150 million
users.&lt;/p&gt;

&lt;p&gt;Check out the &lt;a href=&quot;https://prezi.com/view/P4HYav74ficPkkTAHjXJ/&quot;&gt;Prezi slide deck for Vincenzo’s talk&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/07/11/trino-summit-2024-call-for-speakers.html&quot;&gt;Trino Summit 2024&lt;/a&gt; is coming on the 11th and 12th of December, and registration, call for
speakers, and sponsorship opportunities are open.&lt;/li&gt;
  &lt;li&gt;Next &lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-25-jul-2024&quot;&gt;Trino Contributor Call&lt;/a&gt; on the 25th of July.&lt;/li&gt;
  &lt;li&gt;Next Trino Community Broadcast on the 1st of August.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>61: Trino powers business intelligence</title>
      <link href="https://trino.io/episodes/61.html" rel="alternate" type="text/html" title="61: Trino powers business intelligence" />
      <published>2024-06-20T00:00:00+00:00</published>
      <updated>2024-06-20T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/61</id>
      <content type="html" xml:base="https://trino.io/episodes/61.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of Trino
Community Leadership at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;,
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest&quot;&gt;Guest&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/patrick-pichler/&quot;&gt;Patrick Pichler&lt;/a&gt;, Owner and
co-founder at &lt;a href=&quot;https://www.creativedata.io/&quot;&gt;Creative Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-and-news&quot;&gt;Releases and news&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-449.html&quot;&gt;Trino 449&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add OpenLineage event listener.&lt;/li&gt;
  &lt;li&gt;Add support for views when using the Iceberg REST catalog.&lt;/li&gt;
  &lt;li&gt;Improve write performance for Parquet files in the Hive, Iceberg, and Delta
Lake connectors.&lt;/li&gt;
  &lt;li&gt;Improve equality delete performance in Iceberg connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-450.html&quot;&gt;Trino 450&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improve performance for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;first_value()&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;last_value()&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;date_trunc()&lt;/code&gt;,
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;date_add()&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;date_diff()&lt;/code&gt; functions.&lt;/li&gt;
  &lt;li&gt;Add support for concurrent &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UPDATE&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MERGE&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DELETE&lt;/code&gt; queries in Delta
Lake connector.&lt;/li&gt;
  &lt;li&gt;Add support for reading UniForm tables in Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Add support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TRUNCATE&lt;/code&gt; in the Iceberg and Memory connectors.&lt;/li&gt;
  &lt;li&gt;Automatically configure BigQuery scan parallelism.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;first-recap-from-trino-fest-2024&quot;&gt;First recap from Trino Fest 2024&lt;/h2&gt;

&lt;p&gt;Cole and Manfred chat a bit about Trino Fest last week, mentioning that &lt;a href=&quot;https://www.youtube.com/playlist?list=PLFnr63che7waExsD4lWarA3ML4R2HH58A&quot;&gt;all
videos are now available&lt;/a&gt;,
and a blog post with slides and more material is coming as well.&lt;/p&gt;

&lt;h2 id=&quot;impression-from-trino-contributor-congregation&quot;&gt;Impression from Trino Contributor Congregation&lt;/h2&gt;

&lt;p&gt;Manfred and Dain led the discussions in the congregation. We are excited about
many of the follow-ups for the project, and about the increased collaboration and
innovation.&lt;/p&gt;

&lt;h2 id=&quot;guest-patrick-pichler&quot;&gt;Guest Patrick Pichler&lt;/h2&gt;

&lt;p&gt;Patrick specializes in providing guidance, designing, and implementing
sustainable data, analytics and AI solutions utilizing open architectures at
Creative Data. He has a long history of working in the data and data platform
space as user, developer, administrator, manager, consultant, and educator.&lt;/p&gt;

&lt;h2 id=&quot;powerbi-overview&quot;&gt;PowerBI overview&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://powerbi.microsoft.com/&quot;&gt;Power BI&lt;/a&gt; is an interactive data visualization
software product suite developed by Microsoft with a primary focus on business
intelligence. We talk about the different available products and features, and
their usage in the community.&lt;/p&gt;

&lt;h2 id=&quot;trino-client-support-options-for-power-bi&quot;&gt;Trino client support options for Power BI&lt;/h2&gt;

&lt;p&gt;Typically, Power BI relies on ODBC drivers for connecting to specific data
sources. Since there is no open source Trino ODBC driver however, Patrick and
other clever developers have created a &lt;a href=&quot;https://github.com/CreativeDataEU/PowerBITrinoConnector&quot;&gt;Power BI
client&lt;/a&gt; that connects
to Trino directly via the client REST API - the
&lt;a href=&quot;https://github.com/CreativeDataEU/PowerBITrinoConnector&quot;&gt;PowerBITrinoConnector&lt;/a&gt;.
We discuss the details and limitations of both approaches, look at the source
code, and learn about import and direct query modes.&lt;/p&gt;

&lt;h2 id=&quot;demo&quot;&gt;Demo&lt;/h2&gt;

&lt;p&gt;Patrick showcases how to install and use the connector in his demo of Trino and
Power BI.&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.starburst.io/info/trino-summit-2024/?utm_medium=trino&amp;amp;utm_source=website&amp;amp;utm_campaign=NORAM-FY25-Q4-CM-Trino-Summit-2024-IMC-Upgrade&amp;amp;utm_content=Trino-Fest-Blog-Recap&quot;&gt;Trino Summit 2024&lt;/a&gt;
is coming on the 11th and 12th of December, and registration is open now.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>60: Trino calling AI</title>
      <link href="https://trino.io/episodes/60.html" rel="alternate" type="text/html" title="60: Trino calling AI" />
      <published>2024-05-22T00:00:00+00:00</published>
      <updated>2024-05-22T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/60</id>
      <content type="html" xml:base="https://trino.io/episodes/60.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of Trino
Community Leadership at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;,
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest&quot;&gt;Guest&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/isainalcik/&quot;&gt;Isa Inalcik&lt;/a&gt;, Principal Data
Engineer at &lt;a href=&quot;https://bestsecret.com/&quot;&gt;BestSecret Group&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-and-news&quot;&gt;Releases and news&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-446.html&quot;&gt;Trino 446&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for the Snowflake catalog in the Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Add support for reading S3 objects restored from Glacier storage in the Hive
connector.&lt;/li&gt;
  &lt;li&gt;Add support for unsupported type handling configuration in the Snowflake
connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-447.html&quot;&gt;Trino 447&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SHOW CREATE FUNCTION&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Require Java 22.&lt;/li&gt;
  &lt;li&gt;Add support for concurrent &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DELETE&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TRUNCATE&lt;/code&gt; in the Delta Lake
connector.&lt;/li&gt;
  &lt;li&gt;Remove support for Phoenix 5.1.x and earlier.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-448.html&quot;&gt;Trino 448&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improve performance of reading from Parquet files.&lt;/li&gt;
  &lt;li&gt;Add support for caching Glue metadata with the update to use the V2 REST
interface.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trinodb.github.io/trino-gateway/release-notes/&quot;&gt;Trino Gateway 8 and 9&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for configurable router policies with two new policies available.&lt;/li&gt;
  &lt;li&gt;Add a Helm chart for deployment.&lt;/li&gt;
  &lt;li&gt;Add new website.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We also had a new Trino Helm chart release 0.20.0.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/nineinchnick&quot;&gt;Jan Waś&lt;/a&gt; is now also
&lt;a href=&quot;https://trino.io/development/roles#subproject-maintainers&quot;&gt;subproject maintainer&lt;/a&gt; of the
&lt;a href=&quot;https://github.com/trinodb/trino-go-client&quot;&gt;go client&lt;/a&gt; and the
&lt;a href=&quot;https://github.com/trinodb/charts&quot;&gt;Helm charts&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;impressions-from-the-iceberg-summit&quot;&gt;Impressions from the Iceberg Summit&lt;/h2&gt;

&lt;p&gt;Last week, Cole attended the &lt;a href=&quot;https://iceberg-summit.org/&quot;&gt;Iceberg Summit&lt;/a&gt; with
a special Trino perspective, and we chat about his impressions and major
takeaways.&lt;/p&gt;

&lt;h2 id=&quot;guest-isa-inalcik-from-bestsecret&quot;&gt;Guest Isa Inalcik from BestSecret&lt;/h2&gt;

&lt;p&gt;Isa is a highly skilled data expert with over a decade of hands-on experience in
the software development lifecycle. He is well versed in many data tools, including
Trino/Starburst Enterprise Platform, Snowflake, Airflow, Apache Spark, Hive,
Apache Iceberg, dbt, and others.&lt;/p&gt;

&lt;h2 id=&quot;trino-at-bestsecret&quot;&gt;Trino at BestSecret&lt;/h2&gt;

&lt;p&gt;At BestSecret, a leading online retailer for fashion and lifestyle in Europe,
Isa spearheads the development of efficient and resilient ELT/ETL pipelines and
the implementation of data and AI-driven solutions. We chat in more detail
about their setup and use cases, his solutions, and the challenges he is facing.&lt;/p&gt;

&lt;h2 id=&quot;generative-ai-interest-and-use-cases&quot;&gt;Generative AI interest and use cases&lt;/h2&gt;

&lt;p&gt;Isa has been following the waves of interest in AI and sees the following use
cases related to data and Trino:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Media (audio, video, images): Extract information out of media such as images.&lt;/li&gt;
  &lt;li&gt;Object categorization: Categorize objects in images and videos.&lt;/li&gt;
  &lt;li&gt;Data masking: For anonymizing sensitive data from unstructured text.&lt;/li&gt;
  &lt;li&gt;Data extraction: To pull structured information from unstructured text.&lt;/li&gt;
  &lt;li&gt;Sentiment analysis: For gauging the sentiment of textual data.&lt;/li&gt;
  &lt;li&gt;Language detection or translation: For detecting the language of a text or translating it.&lt;/li&gt;
  &lt;li&gt;Summarization: To generate concise summaries from lengthy texts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This inspired him to try an integration of the new emerging LLMs with Trino.&lt;/p&gt;

&lt;h2 id=&quot;trino-spi&quot;&gt;Trino SPI&lt;/h2&gt;

&lt;p&gt;Trino uses a service provider interface (SPI) to allow developers to create
plugins for features such as connectors, security integrations, and custom
functions. This is crucial for businesses to implement required functionality,
and it enabled Isa to work on a plugin that supports custom functions calling LLMs.&lt;/p&gt;

&lt;p&gt;The OpenAI API specification also allowed him to create one function that can be
used with different LLM backends.&lt;/p&gt;
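&lt;p&gt;Because many LLM servers expose an OpenAI-compatible chat completions endpoint,
one function can target different backends simply by swapping the base URL. A
hypothetical sketch of the request shape (the function name and prompt wrapping
are illustrative, not Isa’s actual implementation):&lt;/p&gt;

```python
# Illustrative sketch: building an OpenAI-style chat completion request so the
# same function can talk to any OpenAI-compatible backend (for example a local
# Ollama server or a hosted API) just by changing base_url. Names are hypothetical.
def build_prompt_request(base_url, model, task, text):
    return {
        "url": base_url.rstrip("/") + "/v1/chat/completions",
        "payload": {
            "model": model,
            "messages": [
                {"role": "system", "content": task},  # e.g. "Summarize the input."
                {"role": "user", "content": text},    # the column value from SQL
            ],
        },
    }
```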

&lt;h2 id=&quot;proof-of-concept-and-demo&quot;&gt;Proof of concept and demo&lt;/h2&gt;

&lt;p&gt;We look at the concept and implementation that Isa developed with the following
architecture:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/episode/60/trino-ai-architecture.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Isa’s &lt;a href=&quot;https://github.com/alaturqua/trino-ai&quot;&gt;trino-ai repository&lt;/a&gt; contains
source code and more details as mentioned in his post on
&lt;a href=&quot;https://www.linkedin.com/posts/isainalcik_trino-trino-llama3-activity-7187411736587587584-e2WW/&quot;&gt;LinkedIn&lt;/a&gt;
and used in the demo.&lt;/p&gt;

&lt;h2 id=&quot;other-resources&quot;&gt;Other resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Post from Isa: &lt;a href=&quot;https://www.linkedin.com/pulse/maximize-performance-secret-scaling-trino-clusters-isa-inalcik-ffo5e/&quot;&gt;Maximize Performance: The Secret to Scaling Trino Clusters with KEDA&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Post from Isa: &lt;a href=&quot;https://www.linkedin.com/pulse/enhancing-security-observability-trino-open-policy-agent-isa-inalcik-zhl9e&quot;&gt;Enhancing Security and Observability in Trino with Open Policy Agent and OpenTelemetry&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://ollama.com/&quot;&gt;Ollama&lt;/a&gt;, the system used to run LLMs in the demo&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/develop.html&quot;&gt;Trino SPI documentation&lt;/a&gt;, including
&lt;a href=&quot;https://trino.io/docs/current/develop/functions.html&quot;&gt;custom function creation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Trino Fest news:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/05/08/trino-fest-lineup-finalized.html&quot;&gt;Finalized speaker lineup announced&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.starburst.io/info/trino-fest-2024/?utm_medium=trino&amp;amp;utm_source=website&amp;amp;utm_campaign=Global-FY25-Q2-EV-Trino-Fest-2024&amp;amp;utm_content=banner&quot;&gt;Register for event and hotel now&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Special thanks to our Trino Fest sponsors: Starburst as event host, and
Alluxio, Cloudinary, Onehouse, StarTree, and Upsolver as event sponsors.&lt;/li&gt;
  &lt;li&gt;Contact us to join the Trino Contributor Congregation the next day.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Other topics:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Trino Contributor Call on the 23rd of May.&lt;/li&gt;
  &lt;li&gt;Check out upcoming &lt;a href=&quot;https://trino.io/community.html#events&quot;&gt;Trino Community Broadcast episodes and other events&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>59: Querying Trino with Java and jOOQ</title>
      <link href="https://trino.io/episodes/59.html" rel="alternate" type="text/html" title="59: Querying Trino with Java and jOOQ" />
      <published>2024-04-24T00:00:00+00:00</published>
      <updated>2024-04-24T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/59</id>
      <content type="html" xml:base="https://trino.io/episodes/59.html">&lt;h2 id=&quot;host&quot;&gt;Host&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of Trino
Community Leadership at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;,
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest&quot;&gt;Guest&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Lukas Eder, Creator of &lt;a href=&quot;https://www.jooq.org/&quot;&gt;jOOQ&lt;/a&gt;,
(&lt;a href=&quot;https://twitter.com/lukaseder&quot;&gt;@lukaseder&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;trino-releases&quot;&gt;Trino releases&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-445.html&quot;&gt;Trino 445&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for time travel queries with the Delta Lake connector.&lt;/li&gt;
  &lt;li&gt;Add support for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;REPLACE&lt;/code&gt; modifier as part of a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CREATE TABLE&lt;/code&gt; statement
with the Delta Lake connector.&lt;/li&gt;
  &lt;li&gt;Add support for writing Bloom filters in Parquet files with the Hive connector.&lt;/li&gt;
  &lt;li&gt;Add support for dynamic filtering to the MongoDB connector.&lt;/li&gt;
  &lt;li&gt;Expand support for function pushdown in the Snowflake connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;lukas-eder-and-data-geekery&quot;&gt;Lukas Eder and Data Geekery&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://twitter.com/lukaseder&quot;&gt;Lukas&lt;/a&gt; is recognized as a Java Champion and
well-known as a very active member of the Java community. We chat about his
history and involvement in the Java community and related open source
projects, and how it led to &lt;a href=&quot;https://www.jooq.org/&quot;&gt;jOOQ and his company Data
Geekery&lt;/a&gt;. Lukas also briefly talks about other products.&lt;/p&gt;

&lt;h2 id=&quot;jooq&quot;&gt;jOOQ&lt;/h2&gt;

&lt;p&gt;jOOQ stands for jOOQ Object Oriented Querying. It generates Java code
from your database schema, and lets you build type-safe SQL queries through its
fluent API.&lt;/p&gt;
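
&lt;p&gt;As a minimal sketch of the fluent API, assuming jOOQ 3.19 or newer and a
Trino cluster reachable over JDBC, a query can look like the following. The
catalog, table, and column names are illustrative:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import java.sql.Connection;
import java.sql.DriverManager;

import org.jooq.DSLContext;
import org.jooq.SQLDialect;
import org.jooq.impl.DSL;

// Connect to Trino over JDBC and build a type-safe query with the jOOQ DSL.
try (Connection connection =
        DriverManager.getConnection(&quot;jdbc:trino://localhost:8080/tpch&quot;, &quot;admin&quot;, null)) {
    DSLContext create = DSL.using(connection, SQLDialect.TRINO);
    create.select(DSL.field(&quot;name&quot;), DSL.field(&quot;nationkey&quot;))
          .from(DSL.table(&quot;tiny.nation&quot;))
          .orderBy(DSL.field(&quot;name&quot;))
          .fetch()
          .forEach(System.out::println);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With code generation, the string-based &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DSL.field&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DSL.table&lt;/code&gt; calls are replaced by generated, compile-time checked table
and column references.&lt;/p&gt;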

&lt;p&gt;All editions of jOOQ since the 3.19 release include support for Trino. The
level of support depends on the catalog and connector in use, and further
Trino-specific enhancements are in progress.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/ecosystem/add-on.html#jooq&quot;&gt;
  &lt;img src=&quot;https://trino.io/assets/images/logos/jooq.png&quot; /&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In our conversation and demo session with Lukas, we cover all the following
aspects and a few other topics:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;What is jOOQ?&lt;/li&gt;
  &lt;li&gt;What motivated the creation of jOOQ?&lt;/li&gt;
  &lt;li&gt;Discuss the great reasons for using jOOQ:
    &lt;ul&gt;
      &lt;li&gt;Database first&lt;/li&gt;
      &lt;li&gt;Typesafe SQL&lt;/li&gt;
      &lt;li&gt;Code generation&lt;/li&gt;
      &lt;li&gt;Active records&lt;/li&gt;
      &lt;li&gt;Multi-tenancy&lt;/li&gt;
      &lt;li&gt;Standardization&lt;/li&gt;
      &lt;li&gt;Query lifecycle&lt;/li&gt;
      &lt;li&gt;Procedures&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;How does it compare to ORM systems like &lt;a href=&quot;https://hibernate.org/&quot;&gt;Hibernate&lt;/a&gt; or
others like the old &lt;a href=&quot;https://blog.mybatis.org/&quot;&gt;MyBatis&lt;/a&gt;?&lt;/li&gt;
  &lt;li&gt;What databases are supported by jOOQ and commonly used?&lt;/li&gt;
  &lt;li&gt;Chat about some customer use cases.&lt;/li&gt;
  &lt;li&gt;Supported and required Java versions, fun with upgrades, and experience from customers.&lt;/li&gt;
  &lt;li&gt;How Lukas discovered Trino and decided to add support for it.&lt;/li&gt;
  &lt;li&gt;Challenges and interesting aspects of supporting different databases&lt;/li&gt;
  &lt;li&gt;What is next for jOOQ in general, and Trino support specifically?&lt;/li&gt;
  &lt;li&gt;Cool SQL features in Trino that might be suitable for standardization:
    &lt;ul&gt;
      &lt;li&gt;Higher order functions, partially &lt;a href=&quot;https://www.jooq.org/doc/dev/manual/sql-building/column-expressions/array-functions/&quot;&gt;already supported in jOOQ&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;Integration of object-relational database features, such as nested
collections with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ARRAY&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LIST&lt;/code&gt;.&lt;/li&gt;
      &lt;li&gt;Potential introduction of new concepts to SQL, such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MAP&lt;/code&gt;.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Complexities from Trino having different catalogs and connectors, and the
catalog, schema, table hierarchy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;jOOQ resources and further information:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.jooq.org/&quot;&gt;Website&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://groups.google.com/g/jooq-user&quot;&gt;User group mailing list&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.jooq.org/learn/&quot;&gt;Documentation and other learning resources&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/jOOQ/jOOQ&quot;&gt;Source code&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/jOOQ/jOOQ/tree/main/jOOQ-examples&quot;&gt;Example projects&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://twitter.com/JavaOOQ&quot;&gt;jOOQ on X&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Trino Fest news:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/04/15/trino-fest-2024-approaches.html&quot;&gt;Great speaker lineup&lt;/a&gt; announced&lt;/li&gt;
  &lt;li&gt;More to come&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.starburst.io/info/trino-fest-2024/?utm_medium=trino&amp;amp;utm_source=website&amp;amp;utm_campaign=Global-FY25-Q2-EV-Trino-Fest-2024&amp;amp;utm_content=banner&quot;&gt;Register for event and hotel now&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Contact us to join the Trino Contributor Congregation the next day&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Other news and events:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Manfred’s recap of Open Source Summit NA and Data Engineer Things meeting in Seattle.&lt;/li&gt;
  &lt;li&gt;Trino Contributor Call right after the episode.&lt;/li&gt;
  &lt;li&gt;Contact us to be a guest in upcoming &lt;a href=&quot;https://trino.io/broadcast/index.html&quot;&gt;Trino Community
Broadcast&lt;/a&gt; episodes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Host</summary>

      
      
    </entry>
  
    <entry>
      <title>58: Understanding your users with Trino and Mitzu</title>
      <link href="https://trino.io/episodes/58.html" rel="alternate" type="text/html" title="58: Understanding your users with Trino and Mitzu" />
      <published>2024-04-04T00:00:00+00:00</published>
      <updated>2024-04-04T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/58</id>
      <content type="html" xml:base="https://trino.io/episodes/58.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of Trino
Community Leadership at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;,
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/imeszaros/&quot;&gt;István Mészáros&lt;/a&gt;, Founder and CEO of
&lt;a href=&quot;https://www.mitzu.io/&quot;&gt;Mitzu&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;trino-releases&quot;&gt;Trino releases&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-442.html&quot;&gt;Trino 442&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for configuring AWS deployment type in OpenSearch connector.&lt;/li&gt;
  &lt;li&gt;Fix a regression from 440 in Iceberg connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-443.html&quot;&gt;Trino 443&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Ensure all files are deleted when native S3 file system support is enabled,
and some other object storage connector improvements.&lt;/li&gt;
  &lt;li&gt;Add support for a custom authorization header name in Prometheus connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-444.html&quot;&gt;Trino 444&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Update Docker image to use Java 22 for runtime.&lt;/li&gt;
  &lt;li&gt;Numerous performance improvements for the Snowflake connector.&lt;/li&gt;
  &lt;li&gt;Add support for reading &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BYTE_STREAM_SPLIT&lt;/code&gt; encoding in Parquet files.&lt;/li&gt;
  &lt;li&gt;Add support for canned access control lists with the native S3 file system.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;other-trino-news&quot;&gt;Other Trino news&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino-gateway/blob/main/docs/release-notes.md#trino-gateway-7-21--mar-2024&quot;&gt;Trino Gateway
7&lt;/a&gt;
shipped with a new user interface thanks to a contribution from our new
&lt;a href=&quot;https://www.starburst.io/community/trino-champions/#peng-wei&quot;&gt;Starburst Trino champion Peng
Wei&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;The continuous integration and build setup with Apache Maven improved a lot
thanks to our collaboration with the new &lt;a href=&quot;https://www.starburst.io/community/trino-champions/#tamas-cservenak&quot;&gt;Starburst Trino
champion Tamas Cservenak&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Trino Contributor Call &lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-21-mar-2024&quot;&gt;recap is now
available&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;mitzu&quot;&gt;Mitzu&lt;/h2&gt;

&lt;p&gt;Mitzu is a warehouse-native product analytics platform that revolutionizes how
companies leverage their product usage data in the data lake.&lt;/p&gt;

&lt;p&gt;By directly connecting to Trino, Mitzu eliminates the need for traditional
reverse ETL processes to third-party applications such as Amplitude or Mixpanel.
Mitzu enables real-time, self-served product analytics on top of the existing
data infrastructure with generated SQL queries.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/ecosystem/client.html#mitzu&quot;&gt;
  &lt;img src=&quot;https://trino.io/assets/images/logos/mitzu.png&quot; /&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In our conversation and demo session with István, we cover all the following
aspects and a few other topics:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;What is product analytics?&lt;/li&gt;
  &lt;li&gt;Discuss some key terms, such as segmentation, funnels, and retention, and
what insights and benefits become available.&lt;/li&gt;
  &lt;li&gt;What are some example use cases?&lt;/li&gt;
  &lt;li&gt;What kind of products can be analyzed?&lt;/li&gt;
  &lt;li&gt;Use of Mitzu for marketing.&lt;/li&gt;
  &lt;li&gt;What other product analytics tools exist, and what sets Mitzu apart?&lt;/li&gt;
  &lt;li&gt;How is Trino involved to make Mitzu warehouse-native?&lt;/li&gt;
  &lt;li&gt;What are the advantages of being warehouse-native? What does that mean?&lt;/li&gt;
  &lt;li&gt;Compare with Mitzu on other data platforms.&lt;/li&gt;
  &lt;li&gt;Implementation details of the Mitzu and Trino integration, such as connectors,
security, and client libraries&lt;/li&gt;
  &lt;li&gt;How to use Mitzu in terms of deployment and configuration.&lt;/li&gt;
  &lt;li&gt;Cool features of Mitzu.&lt;/li&gt;
  &lt;li&gt;Practical experience and customers.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Trino Fest news:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Speakers are selected, with contact and announcement coming soon&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/02/20/announcing-trino-fest-2024.html&quot;&gt;Register now&lt;/a&gt;, and book
travel and hotel.&lt;/li&gt;
  &lt;li&gt;Contact us to join the Trino Contributor Congregation the next day&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Other news and events:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Manfred will attend &lt;a href=&quot;https://events.linuxfoundation.org/open-source-summit-north-america/&quot;&gt;Open Source Summit
NA&lt;/a&gt;, and
present a Big Data Whirlwind Tour at the &lt;a href=&quot;https://www.meetup.com/data-engineer-things-seattle-meetup/events/300067664/&quot;&gt;inaugural Data Engineer Things
meeting&lt;/a&gt;
in Seattle.&lt;/li&gt;
  &lt;li&gt;Trino Contributor Call is now planned as monthly event with video recordings.&lt;/li&gt;
  &lt;li&gt;Check out the upcoming &lt;a href=&quot;https://trino.io/broadcast/index.html&quot;&gt;Trino Community
Broadcast&lt;/a&gt; episode about jOOQ.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from O’Reilly.
You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>57: Seeing clearly with OpenTelemetry</title>
      <link href="https://trino.io/episodes/57.html" rel="alternate" type="text/html" title="57: Seeing clearly with OpenTelemetry" />
      <published>2024-03-14T00:00:00+00:00</published>
      <updated>2024-03-14T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/57</id>
      <content type="html" xml:base="https://trino.io/episodes/57.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of Trino
Community Leadership at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;,
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/electrum/&quot;&gt;David Phillips&lt;/a&gt;, co-creator of Trino
and CTO at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/jmstephenson/&quot;&gt;Matt Stephenson&lt;/a&gt;, Senior Principal
Software Engineer at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;trino-releases&quot;&gt;Trino releases&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-440.html&quot;&gt;Trino 440&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New Snowflake connector&lt;/li&gt;
  &lt;li&gt;Support for sub-queries inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UNNEST&lt;/code&gt; clauses&lt;/li&gt;
  &lt;li&gt;Support for row filtering and column masking with Open Policy Agent&lt;/li&gt;
  &lt;li&gt;Improved latency when filesystem caching is enabled in Delta and Iceberg connectors&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-441.html&quot;&gt;Trino 441&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Remove the default &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;legacy&lt;/code&gt; mode for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hive.security&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There is a regression for Iceberg in this release, so consider waiting for 442.
(Update: &lt;a href=&quot;https://trino.io/docs/current/release/release-442.html&quot;&gt;Trino 442&lt;/a&gt; is released.)&lt;/p&gt;

&lt;h2 id=&quot;other-trino-news&quot;&gt;Other Trino news&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/issues/20980&quot;&gt;Java 22 is coming to Trino&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;David Phillips appointed as dedicated &lt;a href=&quot;https://trino.io/development/roles.html#file-system-lead&quot;&gt;file system lead&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/wiki/Contributor-meetings#trino-contributor-call-21-mar-2024&quot;&gt;Trino Contributor Call&lt;/a&gt; on the 21st of March&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2024/02/27/the-definitive-guide-2-jp.html&quot;&gt;Japanese edition of Trino: The Definitive Guide is out&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;opentelemetry&quot;&gt;OpenTelemetry&lt;/h2&gt;

&lt;p&gt;OpenTelemetry is a widely-used collection of APIs, SDKs, and tools that
instrument, generate, collect, and export telemetry data such as metrics, logs,
and traces to help you analyze application performance and behavior.&lt;/p&gt;
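
&lt;p&gt;In Trino itself, enabling trace export comes down to a couple of
configuration properties. This minimal sketch assumes an OTLP-compatible
collector, such as Jaeger, listening locally on the default port:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# etc/config.properties
tracing.enabled=true
# OTLP endpoint of the collector that receives the spans
tracing.exporter.endpoint=http://localhost:4317
&lt;/code&gt;&lt;/pre&gt;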

&lt;p&gt;&lt;a href=&quot;https://trino.io/ecosystem/add-on.html#opentelemetry&quot;&gt;
  &lt;img src=&quot;https://trino.io/assets/images/logos/opentelemetry.png&quot; /&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In our conversation with Matt and David we cover all the following aspects, and
a few other topics:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;What is &lt;a href=&quot;https://trino.io/ecosystem/add-on#opentelemetry&quot;&gt;OpenTelemetry&lt;/a&gt;?&lt;/li&gt;
  &lt;li&gt;Some basic concepts like &lt;a href=&quot;https://opentelemetry.io/docs/concepts/observability-primer/&quot;&gt;logs, spans, traces&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;How is this related to JMX, system data, and other monitoring?&lt;/li&gt;
  &lt;li&gt;What is &lt;a href=&quot;https://openmetrics.io/&quot;&gt;OpenMetrics&lt;/a&gt;? How is it related to
&lt;a href=&quot;https://trino.io/ecosystem/data-source.html#prometheus&quot;&gt;Prometheus&lt;/a&gt;?&lt;/li&gt;
  &lt;li&gt;What tools can you use with OpenTelemetry? Jaeger, Datadog, …&lt;/li&gt;
  &lt;li&gt;Reasoning to add OpenTelemetry to Trino&lt;/li&gt;
  &lt;li&gt;Implementation details&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/admin/opentelemetry.html&quot;&gt;Trino documentation&lt;/a&gt; with
local example usage with Docker containers for Trino and Jaeger&lt;/li&gt;
  &lt;li&gt;Practical experience&lt;/li&gt;
  &lt;li&gt;Demo of real world usage with Starburst Galaxy and Datadog&lt;/li&gt;
  &lt;li&gt;Bonus topic - JSON-format logging via TCP socket&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;/blog/2024/02/20/announcing-trino-fest-2024.html&quot;&gt;Trino Fest 2024 and Trino Contributor Congregation&lt;/a&gt; are happening in June in Boston.
Submit your speaker proposals now, and register for the free event as soon as
you can, especially for live attendance.&lt;/p&gt;

&lt;p&gt;Check out the upcoming &lt;a href=&quot;https://trino.io/broadcast/index.html&quot;&gt;Trino Community
Broadcast&lt;/a&gt; episodes about Mitzu and jOOQ.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from O’Reilly.
You can get &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or buy the
&lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;English, Polish, Chinese, or Japanese
edition&lt;/a&gt; online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>56: The vast possibilities of VAST and Trino</title>
      <link href="https://trino.io/episodes/56.html" rel="alternate" type="text/html" title="56: The vast possibilities of VAST and Trino" />
      <published>2024-02-22T00:00:00+00:00</published>
      <updated>2024-02-22T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/56</id>
      <content type="html" xml:base="https://trino.io/episodes/56.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Manfred Moser, Director of Trino Community Leadership at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://linkedin.com/in/colleen-tartow-phd&quot;&gt;Colleen Tartow&lt;/a&gt;, Field CTO and
Head of Strategy at &lt;a href=&quot;https://vastdata.com/&quot;&gt;VAST Data&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/roman-zeyde/&quot;&gt;Roman Zeyde&lt;/a&gt;, Senior Software
Engineer at &lt;a href=&quot;https://vastdata.com/&quot;&gt;VAST Data&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;release-439&quot;&gt;Release 439&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-439.html&quot;&gt;Trino 439&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New caching layer for Delta Lake, Hive, and Iceberg!&lt;/li&gt;
  &lt;li&gt;Documentation for new native file system support.&lt;/li&gt;
  &lt;li&gt;Fix for setting session properties on catalogs with a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.&lt;/code&gt; in the name.&lt;/li&gt;
  &lt;li&gt;Fix for reading Snappy data.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;trino-gateway-6&quot;&gt;Trino Gateway 6&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Docker container setup!&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-episode-the-vast-database-and-data-platform&quot;&gt;Concept of the episode: The VAST database and data platform&lt;/h2&gt;

&lt;p&gt;Part database, part data warehouse, part data lake, describing
&lt;a href=&quot;https://vastdata.com/&quot;&gt;VAST&lt;/a&gt; in one sentence is not the easiest undertaking.
You can talk about features like deep write buffers with underlying flash
columnar storage, the automatic contextual layer added on top of the data, or
the similarity-based global compression that more than makes up for the smaller
columnar chunks and makes it so much faster to find exactly the data you’re
looking for.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/ecosystem/data-source.html#vast&quot;&gt;
  &lt;img src=&quot;https://trino.io/assets/images/logos/vast.png&quot; /&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So what is VAST? It’s a state-of-the-art data platform. Why are we talking about
it on the Trino Community Broadcast? A world-class data storage solution still
needs a world-class query engine, and its speed paired with Trino’s makes for a
brilliant combination. We’re diving into how it works, why it is designed the
way it is, and the really cool &lt;a href=&quot;https://vastdata.com/database#performance-comparison&quot;&gt;performance
comparison&lt;/a&gt; on
their website showcasing Trino as their favorite query engine.&lt;/p&gt;

&lt;p&gt;Check out our conversation about the VAST database, the VAST data platform, the
Trino connector, internal workings of the system, use cases, customers, and much
more in the interview.&lt;/p&gt;

&lt;p&gt;Also have a look at the &lt;a href=&quot;https://www.youtube.com/watch?v=RutbCY8i22Q&quot;&gt;presentation from Jason Russler about VAST from Trino
Summit 2023&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;/blog/2024/02/20/announcing-trino-fest-2024.html&quot;&gt;Trino Fest 2024 has been announced&lt;/a&gt; for this summer in Boston! Make sure
to check out the announcement blog post and register to attend, submit your
talks, or contact Starburst for information on sponsoring!&lt;/p&gt;

&lt;p&gt;Check out the upcoming Trino Community Broadcast episodes about OpenTelemetry
and Mitzu.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>55: Commander Bun Bun peeks at Peaka</title>
      <link href="https://trino.io/episodes/55.html" rel="alternate" type="text/html" title="55: Commander Bun Bun peeks at Peaka" />
      <published>2024-01-18T00:00:00+00:00</published>
      <updated>2024-01-18T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/55</id>
      <content type="html" xml:base="https://trino.io/episodes/55.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Manfred Moser, Director of Technical Content at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://linkedin.com/in/sakalsiz&quot;&gt;Mustafa Sakalsiz&lt;/a&gt;, CEO at
&lt;a href=&quot;https://www.peaka.com/&quot;&gt;Peaka&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/alitekin/&quot;&gt;Ali Tekin&lt;/a&gt;, Principal Software
Architect at &lt;a href=&quot;https://www.peaka.com/&quot;&gt;Peaka&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-437-438&quot;&gt;Releases 437-438&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-437.html&quot;&gt;Trino 437&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for configuring compression codecs&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;char&lt;/code&gt; values in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;to_utf8()&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lpad()&lt;/code&gt; functions&lt;/li&gt;
  &lt;li&gt;Improved performance for Delta Lake queries without table statistics&lt;/li&gt;
  &lt;li&gt;Improved performance for Iceberg queries with filters on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ROW&lt;/code&gt; columns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-438.html&quot;&gt;Trino 438&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for access control with &lt;a href=&quot;https://trino.io/blog/2024/02/06/opa-arrived&quot;&gt;Open Policy Agent&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALTER COLUMN ... DROP NOT NULL&lt;/code&gt; in Iceberg and PostgreSQL&lt;/li&gt;
  &lt;li&gt;Support for configuring page sizes in Delta Lake, Hive, and Iceberg&lt;/li&gt;
  &lt;li&gt;Better type support for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reduce_agg()&lt;/code&gt; function&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And over in the land of the Trino Gateway…&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino-gateway/blob/main/docs/release-notes.md#trino-gateway-5-24-jan-2024&quot;&gt;Trino Gateway version 5&lt;/a&gt;
released!&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-episode-peaka&quot;&gt;Concept of the episode: Peaka&lt;/h2&gt;

&lt;p&gt;Another Trino Community Broadcast episode means another cool piece of technology
that uses Trino for us to show off to the community. This time it’s Peaka,
a no-code approach to data warehousing that makes it easier than ever to set up
your data stack without needing a ton of complex engineering.&lt;/p&gt;

&lt;p&gt;In &lt;a href=&quot;https://www.peaka.com/docs/getting-started/what-is-peaka/&quot;&gt;their own words&lt;/a&gt;,
Peaka is a platform that merges disparate data sources into a single data layer,
letting you join and blend them, query them using SQL or natural language, and 
expose your data to outside users through APIs. Sounds a bit like Trino, right?
That’s because underneath the hood, Trino is a key part of how they’re making it
happen. In this episode, we talk to the team at Peaka about where they got
started, how they’re making it easier than ever to leverage the federation that
Trino is capable of, and the work they’ve done on top to integrate their
platform with every SaaS data source under the sun.&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-episode-using-peaka&quot;&gt;Demo of the episode: Using Peaka!&lt;/h2&gt;

&lt;p&gt;If you want to see what the platform is like, then look no further. We’ll be
exploring:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Connecting to data sources&lt;/li&gt;
  &lt;li&gt;Filtering and combining data&lt;/li&gt;
  &lt;li&gt;Editing and running queries, including their visual query editor&lt;/li&gt;
  &lt;li&gt;Natural language queries&lt;/li&gt;
  &lt;li&gt;Visualizing data&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;pr-of-the-episode-18719-filesystem-caching-with-alluxio&quot;&gt;PR of the episode: #18719: Filesystem caching with Alluxio&lt;/h2&gt;

&lt;p&gt;Perhaps it’s a little easier to link to the issue tracking
&lt;a href=&quot;https://github.com/trinodb/trino/issues/20550&quot;&gt;the rollout&lt;/a&gt;, but however you
want to present it, caching is back in Trino! Caching is a huge performance win
for a wide variety of use cases, because the engine can skip repeated reads from
object storage and return query results much faster. This is going to improve
performance for Trino queries using the supported object storage connectors,
and you’ll hear more from us once it’s officially launched. The best part is
that there’s even more coming down the line as support for it is expanded.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>54: Trino 2023 wrapped</title>
      <link href="https://trino.io/episodes/54.html" rel="alternate" type="text/html" title="54: Trino 2023 wrapped" />
      <published>2024-01-18T00:00:00+00:00</published>
      <updated>2024-01-18T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/54</id>
      <content type="html" xml:base="https://trino.io/episodes/54.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of
Technical Content at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;,
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://twitter.com/mtraverso&quot;&gt;Martin Traverso&lt;/a&gt;, Trino co-creator and CTO at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-434-436&quot;&gt;Releases 434-436&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-434.html&quot;&gt;Trino 434&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FILTER&lt;/code&gt; clause in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LISTAGG&lt;/code&gt; function&lt;/li&gt;
  &lt;li&gt;Support for reading &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;json&lt;/code&gt; columns and for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DELETE&lt;/code&gt; statements in the BigQuery connector&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-435.html&quot;&gt;Trino 435&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JSON_TABLE&lt;/code&gt; function&lt;/li&gt;
  &lt;li&gt;Improve reliability when reading from GCS&lt;/li&gt;
  &lt;li&gt;Improve query planning performance on Delta Lake tables&lt;/li&gt;
  &lt;li&gt;Improve reliability and memory usage for inserts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-436.html&quot;&gt;Trino 436&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for Elasticsearch 8&lt;/li&gt;
  &lt;li&gt;New OpenSearch connector&lt;/li&gt;
  &lt;li&gt;Faster selective joins on partition columns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additional comments:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Disallow invalid configuration options with the Delta Lake and Iceberg connectors in 434&lt;/li&gt;
  &lt;li&gt;Separate metadata caching in numerous connectors&lt;/li&gt;
  &lt;li&gt;Various improvements for schema evolution in Hive connector&lt;/li&gt;
  &lt;li&gt;Require JDK 21.0.1 to run Trino with 436&lt;/li&gt;
  &lt;li&gt;Remove support for Elasticsearch 6 in 436&lt;/li&gt;
  &lt;li&gt;Fix minor issues for SQL routine and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JSON_TABLE&lt;/code&gt; function users&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;recap-of-trino-in-2023&quot;&gt;Recap of Trino in 2023&lt;/h2&gt;

&lt;p&gt;We chat about all the developments in the Trino project and the Trino community
from 2023, including the following topics:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Various statistics about the project&lt;/li&gt;
  &lt;li&gt;Features and releases&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2023/06/20/trino-fest-2023-recap.html&quot;&gt;Trino Fest&lt;/a&gt;, &lt;a href=&quot;/blog/2023/12/18/trino-summit-recap.html&quot;&gt;Trino
Summit&lt;/a&gt;, and other events&lt;/li&gt;
  &lt;li&gt;New Trino maintainers&lt;/li&gt;
  &lt;li&gt;Polish and Chinese editions of the definitive guide published&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Find more details and other topics in our &lt;a href=&quot;/blog/2024/01/19/trino-2023-wrapped.html&quot;&gt;blog post &lt;strong&gt;Trino 2023 wrapped&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Upcoming events in NYC and Vienna, details available in the &lt;a href=&quot;https://trino.io/community.html#events&quot;&gt;events
calendar&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Trino Contributor Congregation coming soon&lt;/li&gt;
  &lt;li&gt;Trino Gateway developer sync every two weeks, ping Manfred for an invite&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from O’Reilly.
You can download &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF from
Starburst&lt;/a&gt; or &lt;a href=&quot;https://trino.io/trino-the-definitive-guide.html&quot;&gt;buy the book
online&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>53: Understanding your data with Coginiti and Trino</title>
      <link href="https://trino.io/episodes/53.html" rel="alternate" type="text/html" title="53: Understanding your data with Coginiti and Trino" />
      <published>2023-11-16T00:00:00+00:00</published>
      <updated>2023-11-16T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/53</id>
      <content type="html" xml:base="https://trino.io/episodes/53.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/manfredmoser&quot;&gt;Manfred Moser&lt;/a&gt;, Director of
Technical Content at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;,
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/msmullins/&quot;&gt;Matthew Mullins&lt;/a&gt;, CTO at
&lt;a href=&quot;https://www.coginiti.co&quot;&gt;Coginiti&lt;/a&gt;,
(&lt;a href=&quot;https://twitter.com/mullinsms&quot;&gt;@mullinsms&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/rnestertsov/&quot;&gt;Roman Nestertsov&lt;/a&gt;, Principal
Engineer at &lt;a href=&quot;https://www.coginiti.co&quot;&gt;Coginiti&lt;/a&gt;,
(&lt;a href=&quot;https://twitter.com/nestertsov&quot;&gt;@nestertsov&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-431-433&quot;&gt;Releases 431-433&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-431.html&quot;&gt;Trino 431&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for &lt;a href=&quot;https://trino.io/docs/current/routines.html&quot;&gt;SQL routines&lt;/a&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CREATE/DROP FUNCTION&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;REPLACE&lt;/code&gt; modifier in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CREATE TABLE&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Improved latency for prepared statements in JDBC driver&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-432.html&quot;&gt;Trino 432&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Faster filtering on columns containing long strings in Parquet data.&lt;/li&gt;
  &lt;li&gt;Predicate pushdown for real and double columns in MongoDB.&lt;/li&gt;
  &lt;li&gt;Support for Iceberg REST catalog in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;register_table&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unregister_table&lt;/code&gt; procedures.&lt;/li&gt;
  &lt;li&gt;Support for BEARER authentication for Nessie catalog.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-433.html&quot;&gt;Trino 433&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improved support for Hive schema evolution.&lt;/li&gt;
  &lt;li&gt;Add support for altering table comments in the Glue catalog.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note that Trino 433 also includes documentation for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CREATE/DROP CATALOG&lt;/code&gt;.
Check out the third SQL training session for a demo.&lt;/p&gt;

&lt;h2 id=&quot;sql-routine-competition&quot;&gt;SQL routine competition&lt;/h2&gt;

&lt;p&gt;Trino 431 finally delivered the long-awaited support for SQL routines. To
celebrate and see what you all come up with, we are running a competition.
&lt;a href=&quot;/blog/2023/11/09/routines.html&quot;&gt;Share your best SQL routine&lt;/a&gt;, and win a
reward sponsored by &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;.&lt;/p&gt;
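&lt;p&gt;To give a flavor of the feature, here is a minimal inline routine sketch based on the routine syntax in the Trino documentation; the function name and return value are just placeholders:&lt;/p&gt;

```sql
-- Declare an inline routine and call it in the same query
WITH
  FUNCTION meaning_of_life()
    RETURNS bigint
    BEGIN
      RETURN 42;
    END
SELECT meaning_of_life();
```

Routines can also be stored in a catalog with `CREATE FUNCTION`, so they are reusable across queries.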

&lt;h2 id=&quot;call-for-java-21-testing&quot;&gt;Call for Java 21 testing&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/logos/java-duke-21.png&quot; width=&quot;100px&quot; align=&quot;right&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Java 21, the latest LTS release of Java, arrived in September 2023, and we want
to take advantage of the performance improvements, language features, and new
libraries. But to do so, &lt;a href=&quot;/blog/2023/11/03/java-21.html&quot;&gt;we need your input and confirmation that everything
works as expected&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-jdbc-driver&quot;&gt;Concept of the episode: JDBC driver&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;/assets/images/logos/jdbc-small.png&quot; width=&quot;100px&quot; align=&quot;right&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Java_Database_Connectivity&quot;&gt;Java Database Connectivity
(JDBC)&lt;/a&gt; is an
important standard for any JVM-based application that wants to access a
relational database. Trino ships a JDBC driver that abstracts all the low-level
details of our conversational REST API for client tools, and supports various
authentication mechanisms, TLS, and other features. This allows tools like
Coginiti to ignore those details and to work with the community on any
improvements for the benefit of all users.&lt;/p&gt;
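&lt;p&gt;As a rough sketch of what this looks like from the application side, the snippet below builds a Trino JDBC URL and connection properties. The host, port, catalog, and user are made-up placeholders; with the trino-jdbc artifact on the classpath, the commented-out DriverManager call would open the actual connection:&lt;/p&gt;

```java
import java.util.Properties;

public class TrinoJdbcExample {
    // Trino JDBC URLs take the form jdbc:trino://host:port/catalog/schema
    static String buildUrl(String host, int port, String catalog, String schema) {
        return "jdbc:trino://" + host + ":" + port + "/" + catalog + "/" + schema;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("user", "analyst"); // the driver requires a user name
        String url = buildUrl("trino.example.com", 8080, "tpch", "tiny");
        System.out.println(url);
        // try (Connection conn = DriverManager.getConnection(url, props)) { ... }
        // would open the connection, hiding the underlying REST protocol details.
    }
}
```

A client tool only deals with this URL and standard JDBC interfaces; authentication, TLS, and the HTTP conversation with the coordinator stay inside the driver.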

&lt;h2 id=&quot;client-tool-focus-on-coginiti&quot;&gt;Client tool focus on Coginiti&lt;/h2&gt;

&lt;p&gt;Matthew and Roman are joining us from &lt;a href=&quot;https://www.coginiti.co&quot;&gt;Coginiti&lt;/a&gt;.
Coginiti delivers higher-quality analytics faster. Coginiti provides an
AI-enabled enterprise data workspace that integrates modular development,
version control, and data quality testing throughout the analytic development
lifecycle.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.coginiti.co&quot;&gt;
  &lt;img src=&quot;/assets/images/logos/coginiti-small.png&quot; /&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With support for Trino, Coginiti as a client tool provides access to all the
configured catalogs in Trino. It enables data engineers and analysts to work
together in a shared platform, reducing duplication in their work, and bringing
“Don’t repeat yourself (DRY)” to analysts.&lt;/p&gt;

&lt;p&gt;We talk about why Coginiti added &lt;a href=&quot;https://www.coginiti.co/databases/trino/&quot;&gt;support for
Trino&lt;/a&gt;. Coginiti is not a compute
platform itself, but access to many platforms enables a “data blender thinking”.
So as a user you start caring less about the location and source of the
database, and more about the data itself and how you can mix it together to gain
better insights. Every enterprise has more than one data platform, with
different data warehouses, RDBMSes, and data lakes. Matthew talks about reasons
for this situation, and how Trino as a partner platform enables users to
federate across all of these platforms when needed.&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-episode-coginiti-and-trino&quot;&gt;Demo of the episode: Coginiti and Trino&lt;/h2&gt;

&lt;p&gt;In the demo of Coginiti, Roman and Matthew show some of the features of the tool
that enable code reuse and managing transformations on Trino. A tour through
major aspects of the application gives a good impression of the benefits and
supported use cases.&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Our lineup of speakers and sessions for Trino Summit is nearly finalized. Join
us on the 13th and 14th of December for the free, virtual event. Stay tuned for
details about all the sessions soon, and in the meantime - &lt;a href=&quot;https://www.starburst.io/info/trinosummit2023/?utm_source=trino&amp;amp;utm_medium=website&amp;amp;utm_campaign=NORAM-FY24-Q4-EV-Trino-Summit-2023&amp;amp;utm_content=tcb&quot;&gt;don’t forget to
register&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Our &lt;a href=&quot;/blog/2023/09/27/training-series.html&quot;&gt;Trino SQL training series&lt;/a&gt; just
had a successful third session yesterday, and you can check out all the material
in our follow up blog posts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2023/10/18/sql-training-1.html&quot;&gt;Getting started with Trino and SQL&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2023/11/01/sql-training-2.html&quot;&gt;Advanced analytics with SQL and Trino&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There is still a chance for you &lt;a href=&quot;https://www.starburst.io/info/trino-training-series/?utm_source=trino&amp;amp;utm_medium=website&amp;amp;utm_campaign=Global-FY24-Trino-Training-Series&amp;amp;utm_content=1&quot;&gt;to register and attend the fourth session
live&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>52: Commander Bun Bun takes a bite out of Yugabyte</title>
      <link href="https://trino.io/episodes/52.html" rel="alternate" type="text/html" title="52: Commander Bun Bun takes a bite out of Yugabyte" />
      <published>2023-10-26T00:00:00+00:00</published>
      <updated>2023-10-26T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/52</id>
      <content type="html" xml:base="https://trino.io/episodes/52.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Manfred Moser, Director of Technical Content at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/dmagda/&quot;&gt;Denis Magda&lt;/a&gt;, Director of Developer
Relations at &lt;a href=&quot;https://www.yugabyte.com/&quot;&gt;Yugabyte&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-428-430&quot;&gt;Releases 428-430&lt;/h2&gt;

&lt;p&gt;Unofficial highlights from Cole:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-428.html&quot;&gt;Trino 428&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Reduced memory usage for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GROUP BY&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Simplified configuration for managing writer counts&lt;/li&gt;
  &lt;li&gt;Faster reads for small Parquet files on data lakes&lt;/li&gt;
  &lt;li&gt;Support for &lt;a href=&quot;https://docs.pinot.apache.org/users/user-guide-query/query-options&quot;&gt;query options&lt;/a&gt;
on dynamic tables in Pinot&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-429.html&quot;&gt;Trino 429&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Faster reading of ORC files in Hive&lt;/li&gt;
  &lt;li&gt;More types supported for schema evolution in Hive&lt;/li&gt;
  &lt;li&gt;Security improvements, including logging out of a session with the Web UI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-430.html&quot;&gt;Trino 430&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improved performance of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GROUP BY&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Support for setting a timezone on the session level&lt;/li&gt;
  &lt;li&gt;Table statistics in MariaDB&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-episode-jdbc-based-connectors&quot;&gt;Concept of the episode: JDBC-based connectors&lt;/h2&gt;

&lt;p&gt;In Trino, we have a lot of connectors that are based on top of JDBC. JDBC could
stand for “just da best connectors,” but it’s really Java database connectivity,
and it’s one of the core APIs by which many of the most prominent connectors in
the Trino ecosystem function. It’s so common, in fact, that we have
&lt;a href=&quot;/docs/current/develop/example-jdbc.html&quot;&gt;an example JDBC connector in Trino&lt;/a&gt; to
make it easier to implement your own JDBC-based connector if you need one.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-yugabytedb&quot;&gt;Concept of the episode: YugabyteDB&lt;/h2&gt;

&lt;p&gt;But if the topic of today’s episode is YugabyteDB, why are we talking about
PostgreSQL? Well, if you’re unfamiliar with Yugabyte, lifting from
&lt;a href=&quot;https://docs.yugabyte.com/&quot;&gt;their docs&lt;/a&gt;: “YugabyteDB is distributed PostgreSQL
that delivers on-demand scale, built-in resilience, and a multi-API interface.”
Distributed architecture should be a familiar concept to a community involved
with a distributed query engine, and if you understand how Trino is able to
leverage it, you should also understand why it makes sense to pair with
Yugabyte. We’ll be discussing why Yugabyte got started, what it does differently
from other databases, what it does better than other databases, and how you
might want to use it with Trino.&lt;/p&gt;
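&lt;p&gt;Since YugabyteDB speaks the PostgreSQL wire protocol, one plausible setup is a regular PostgreSQL catalog in Trino pointed at the YSQL port (5433 by default). The host name and credentials below are made-up placeholders:&lt;/p&gt;

```properties
# etc/catalog/yugabyte.properties -- hypothetical catalog file
connector.name=postgresql
connection-url=jdbc:postgresql://yugabyte.example.com:5433/yugabyte
connection-user=trino
connection-password=secret
```

With that catalog in place, YugabyteDB tables show up in Trino like any other PostgreSQL source and can be joined against other catalogs.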

&lt;h2 id=&quot;demo-of-the-episode-trino-on-yugabytedb&quot;&gt;Demo of the episode: Trino on YugabyteDB&lt;/h2&gt;

&lt;p&gt;As part of the episode, we’ll also be showing off how you can use YugabyteDB
with Trino. We start with the PostgreSQL connector, and Denis then shows how to
use it to query YugabyteDB from Trino. It’s always hard to
explain demos in show notes, so tune into the YouTube video and take a look for
yourself if you’re curious!&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Trino Summit, the biggest Trino event of the year, is coming up on the 13th and
14th of December, and like Trino Fest, it’ll be fully virtual. If you’d like to
give a talk about anything related to Trino, we’re looking for speakers now.
&lt;a href=&quot;https://sessionize.com/trino-summit-2023/&quot;&gt;Submit your talk here!&lt;/a&gt; If you’d
rather attend, you can also
&lt;a href=&quot;https://www.starburst.io/info/trinosummit2023/?utm_source=trino&amp;amp;utm_medium=website&amp;amp;utm_campaign=NORAM-FY24-Q4-EV-Trino-Summit-2023&amp;amp;utm_content=tcb&quot;&gt;go register to attend now&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Prior to Trino Summit, if you’d like to learn about SQL from the absolute
experts, we’ve also gotten started with the
&lt;a href=&quot;/blog/2023/09/27/training-series&quot;&gt;Trino Training Series&lt;/a&gt;
that we’ll be running as a buildup to the summit. The
&lt;a href=&quot;/blog/2023/10/18/sql-training-1&quot;&gt;recap for the first session&lt;/a&gt;
is live, but there are three more to come! Register now and look forward
to those great sessions starting from the ground up and ending with some key
tricks and Trino specifics that even a seasoned SQL veteran may not know about.&lt;/p&gt;

&lt;p&gt;We also have a talk about Trino on Ice and data meshes coming up in Redwood City
with Slalom and Starburst. If you’re local, consider
&lt;a href=&quot;https://go.slalom.com/starburstnorcal&quot;&gt;signing up and checking it out!&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>51: Trino cools off with PopSQL</title>
      <link href="https://trino.io/episodes/51.html" rel="alternate" type="text/html" title="51: Trino cools off with PopSQL" />
      <published>2023-10-05T00:00:00+00:00</published>
      <updated>2023-10-05T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/51</id>
      <content type="html" xml:base="https://trino.io/episodes/51.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Manfred Moser, Director of Technical Content at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/jakeptrsn/&quot;&gt;Jake Peterson&lt;/a&gt;, Head of Customer
Success at &lt;a href=&quot;https://popsql.com/&quot;&gt;PopSQL&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Matthew Peveler, Software Engineer at &lt;a href=&quot;https://popsql.com/&quot;&gt;PopSQL&lt;/a&gt;,
&lt;a href=&quot;https://github.com/MasterOdin&quot;&gt;MasterOdin&lt;/a&gt; on GitHub&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-423-427&quot;&gt;Releases 423-427&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-423.html&quot;&gt;Trino 423&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Schema evolution for nested fields&lt;/li&gt;
  &lt;li&gt;Support for comments on materialized view columns&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CASCADE&lt;/code&gt; option in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DROP SCHEMA&lt;/code&gt; for Clickhouse, MariaDB, MySQL,
Oracle and SingleStore&lt;/li&gt;
  &lt;li&gt;Various performance improvements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-424.html&quot;&gt;Trino 424&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improved performance for JSON, CSV, text and related formats in Hive&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CASCADE&lt;/code&gt; in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DROP SCHEMA&lt;/code&gt; for PostgreSQL and Iceberg&lt;/li&gt;
  &lt;li&gt;Improved coordinator CPU utilization for large clusters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-425.html&quot;&gt;Trino 425&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improved performance of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GROUP BY&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Support for check constraints in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MERGE&lt;/code&gt; for Delta Lake connector.&lt;/li&gt;
  &lt;li&gt;Support for the Decimal128 in MongoDB connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-426.html&quot;&gt;Trino 426&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SET/RESET SESSION AUTHORIZATION&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Improved performance of aggregations over decimal values.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TRUNCATE TABLE&lt;/code&gt; in Delta Lake connector.&lt;/li&gt;
  &lt;li&gt;Support for Databricks 13.3 LTS.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-427.html&quot;&gt;Trino 427&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improved performance for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GROUP BY&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DISTINCT&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Support for pushing down &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UPDATE&lt;/code&gt; statements into connectors.&lt;/li&gt;
  &lt;li&gt;Support for reading Delta Lake tables with Deletion Vectors.&lt;/li&gt;
  &lt;li&gt;Faster writing to Parquet files in Delta Lake and Iceberg.&lt;/li&gt;
  &lt;li&gt;Support for querying tags in Iceberg.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-episode-popsql&quot;&gt;Concept of the episode: PopSQL&lt;/h2&gt;

&lt;p&gt;Some of our viewers may be familiar with an environment where
key queries and dashboards are buried in someone’s personal workspace, and you
have to go ask that person directly every time you want to check on your metrics.
When you’re running a world-class, highly-performant query engine like Trino and
investing time and resources into maintaining it, shouldn’t you treat your
queries like a first-class, collaborative, versioned system, too?&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://popsql.com/&quot;&gt;PopSQL&lt;/a&gt;, a playful spin on the word popsicle, solves the
sadness that is disorganized and siloed insights by centralizing queries into a
platform that has versioning, security, and a suite of collaborative tools
comparable to Google Drive. Want to work with your teammate on a query? You can
open up the same editor and see the same thing. Want to find that query someone
ran last week to check how the new feature is doing? It’s there. Have
a suggestion to improve something? Leave a comment. Realize your suggestion was
wrong and need to undo the change? You can view past versions of the query.&lt;/p&gt;

&lt;p&gt;PopSQL and Trino make sense together. PopSQL provides a best-in-class interface
for organizing, collaborating, and working together on all of your SQL queries
across the business, and Trino handles running those queries at unparalleled
speeds. They go hand-in-hand for treating your data and SQL analytics as first
class citizens. In today’s episode, we’ll be exploring what PopSQL is, how it
integrates with Trino, and how the engineers at PopSQL have done some cool
things with Trino to make the integration better than ever before. We’ll start
with that last one, actually.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-a-new-nodejs-adapter-for-trino&quot;&gt;Concept of the episode: A new Node.js adapter for Trino&lt;/h2&gt;

&lt;p&gt;Trino in the frontend is… a tricky thing. We can go ahead and admit that the
&lt;a href=&quot;/docs/current/admin/web-interface.html&quot;&gt;Trino web UI&lt;/a&gt; isn’t going to win any
awards for design or functionality. A couple of Node-based libraries
exist out there, including &lt;a href=&quot;https://www.npmjs.com/package/presto-client&quot;&gt;presto-client-node&lt;/a&gt;
and &lt;a href=&quot;https://github.com/vweevers/lento&quot;&gt;lento&lt;/a&gt;. But presto-client-node lacked
support for streaming and had some issues handling 500 errors, and lento doesn’t
quite support Trino out of the box and only supports single streams, which
wasn’t ideal for PopSQL’s distributed architecture. So when PopSQL’s engineers
went to build their frontend and integrate with Trino, what did they do? Build
their own adapter.&lt;/p&gt;

&lt;p&gt;We’ll talk about how it was implemented, what key features it unlocks, and why
it makes using PopSQL with Trino an even better experience.&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-episode-using-popsql-with-trino&quot;&gt;Demo of the episode: Using PopSQL with Trino&lt;/h2&gt;

&lt;p&gt;It’s hard to write show notes for a demo, because you can’t really experience
the demo by reading about what’s happening. But as a surface-level overview,
we’ll be going over:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Setting up a connection&lt;/li&gt;
  &lt;li&gt;The schema explorer&lt;/li&gt;
  &lt;li&gt;The SQL editor&lt;/li&gt;
  &lt;li&gt;Query scheduling&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;pr-of-the-episode-57-on-trino-gateway-release-version-3&quot;&gt;PR of the episode: #57 on trino-gateway: Release version 3&lt;/h2&gt;

&lt;p&gt;Last week, the community officially released the
&lt;a href=&quot;https://github.com/trinodb/trino-gateway&quot;&gt;trino-gateway&lt;/a&gt;, a proxy and load
balancer that enables large operations to run multiple Trino clusters in
harmony with each other to serve big queries and small queries alike. If you or
your organization have a need for more than one Trino cluster and want the
seamless experience of being able to connect to any of them through a single
interface, then check it out! It’s the product of many months of effort and
should be a fantastic solution for running Trino at the absolute largest scales.&lt;/p&gt;

&lt;p&gt;To learn more about it, you should check out
&lt;a href=&quot;/blog/2023/09/28/trino-gateway&quot;&gt;the blog post announcing its first release.&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Trino Summit, the biggest Trino event of the year, is coming up on the 13th and
14th of December, and like Trino Fest, it’ll be fully virtual. If you’d like to
give a talk about anything related to Trino, we’re looking for speakers now.
&lt;a href=&quot;https://sessionize.com/trino-summit-2023/&quot;&gt;Submit your talk here!&lt;/a&gt; If you’d
rather attend, you can also
&lt;a href=&quot;https://www.starburst.io/info/trinosummit2023/?utm_source=trino&amp;amp;utm_medium=website&amp;amp;utm_campaign=NORAM-FY24-Q4-EV-Trino-Summit-2023&amp;amp;utm_content=tcb&quot;&gt;go register to attend now&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Prior to Trino Summit, if you’d like to learn about SQL from the absolute
experts, we’ve also announced the &lt;a href=&quot;/blog/2023/09/27/training-series&quot;&gt;Trino Training Series&lt;/a&gt;
that we’ll be running as a buildup to the summit. Register now and look forward
to four great sessions starting from the ground up and ending with some key
tricks and Trino specifics that even a seasoned SQL veteran may not know about.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>50: Celebrating 50 episodes of Trino Community Broadcast</title>
      <link href="https://trino.io/episodes/50.html" rel="alternate" type="text/html" title="50: Celebrating 50 episodes of Trino Community Broadcast" />
      <published>2023-07-27T00:00:00+00:00</published>
      <updated>2023-07-27T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/50</id>
      <content type="html" xml:base="https://trino.io/episodes/50.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Manfred Moser, Director of Technical Content at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Brian Olsen, Head of Developer Relations at &lt;a href=&quot;https://tabular.io/&quot;&gt;Tabular&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/bitsondatadev&quot;&gt;@bitsondatadev&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Dain Sundstrom, Trino co-creator and CTO at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/daindumb&quot;&gt;@daindumb&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-421-422&quot;&gt;Releases 421-422&lt;/h2&gt;

&lt;p&gt;Unofficial highlights from Cole:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-421.html&quot;&gt;Trino 421&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CHECK&lt;/code&gt; constraints in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UPDATE&lt;/code&gt; statements.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt; on Google Sheets.&lt;/li&gt;
  &lt;li&gt;Faster queries on MongoDB tables with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;row&lt;/code&gt; columns.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-422.html&quot;&gt;Trino 422&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Faster &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CREATE TABLE AS ... SELECT&lt;/code&gt; queries.&lt;/li&gt;
  &lt;li&gt;Support for nested fields in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ADD COLUMN&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Faster Avro reader for Hive.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;register_table&lt;/code&gt; procedure to register Hadoop tables in Iceberg.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-episode-50&quot;&gt;Concept of the episode: 50!&lt;/h2&gt;

&lt;p&gt;No, that’s not a factorial, we’re just excited to have made it to 50 Trino
Community Broadcast episodes. We’ve brought back some familiar faces to talk
about what we’ve done, how we’ve got here, what it takes to keep an open source
project ticking for over a decade, and celebrate the steps we’ve taken along
the way. It’s unscripted, and the discussion goes wherever it feels like going.&lt;/p&gt;

&lt;p&gt;Tune in to hear about the history of the Trino Community Broadcast, the upcoming
Snowflake connector, and a few of the core philosophies that have kept Trino
running. Manfred also shows off updates to the Trino website, highlighting all
the tools, data sources, and add-ons that you can use with Trino.&lt;/p&gt;

&lt;h2 id=&quot;trino-events&quot;&gt;Trino events&lt;/h2&gt;

&lt;p&gt;Trino Fest was a little over a month ago, and we’re publishing the last recap of
all the talks to the Trino blog today! Check out our YouTube channel and the
Trino website to catch up on everything you missed.&lt;/p&gt;

&lt;p&gt;If you have an event that is related to Trino, let us know so we can add it to
the &lt;a href=&quot;https://trino.io/community.html#events&quot;&gt;Trino events calendar&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>49: Trino, Ibis, and wrangling Python in the SQL ecosystem</title>
      <link href="https://trino.io/episodes/49.html" rel="alternate" type="text/html" title="49: Trino, Ibis, and wrangling Python in the SQL ecosystem" />
      <published>2023-07-06T00:00:00+00:00</published>
      <updated>2023-07-06T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/49</id>
      <content type="html" xml:base="https://trino.io/episodes/49.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Manfred Moser, Director of Technical Content at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guest&quot;&gt;Guest&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/cpcloud&quot;&gt;Phillip Cloud&lt;/a&gt;, Principal Engineer at Voltron
Data. &lt;a href=&quot;https://www.youtube.com/@cpcloud&quot;&gt;Check out his YouTube channel!&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-419-420&quot;&gt;Releases 419-420&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-419.html&quot;&gt;Trino 419&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;array_histogram&lt;/code&gt; function.&lt;/li&gt;
  &lt;li&gt;Faster reading and writing of Parquet data.&lt;/li&gt;
  &lt;li&gt;Support for Nessie catalog in Iceberg connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-420.html&quot;&gt;Trino 420&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Underscores in numeric literals (e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1_000_000&lt;/code&gt;).&lt;/li&gt;
  &lt;li&gt;Hexadecimal, binary, and octal numeric literals (e.g. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0x1a&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0b1010&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0o12&lt;/code&gt;).&lt;/li&gt;
  &lt;li&gt;Support for comments on view columns in Delta Lake connector.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RENAME COLUMN&lt;/code&gt; in MongoDB connector.&lt;/li&gt;
  &lt;li&gt;Support for mixed case table names in Druid connector.&lt;/li&gt;
  &lt;li&gt;Faster queries when statistics are unavailable.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;question-of-the-episode-what-is-ibis&quot;&gt;Question of the episode: What is Ibis?&lt;/h2&gt;

&lt;p&gt;Taken straight from &lt;a href=&quot;https://ibis-project.org/concept/why_ibis/&quot;&gt;the Ibis website&lt;/a&gt;,
Ibis is a dataframe interface to execution engines with support for 15+
backends (including Trino!). Ibis doesn’t replace your existing execution
engine; it extends it with powerful abstractions and intuitive syntax.&lt;/p&gt;

&lt;p&gt;For those who love doing all their data-related work in Python, this allows you
to write Python code that leverages the speed and power of Trino without needing
to become a SQL master. For the die-hard SQL users out there,
&lt;a href=&quot;https://ibis-project.org/tutorial/ibis-for-sql-users/&quot;&gt;they have a guide on Ibis for SQL users&lt;/a&gt;
that explains how it fully replaces SQL with Python code that is:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Type-checked and validated as you go.&lt;/li&gt;
  &lt;li&gt;Easier to write. Pythonic function calls with tab completion in IPython.&lt;/li&gt;
  &lt;li&gt;More composable. Break complex queries down into easier-to-digest pieces.&lt;/li&gt;
  &lt;li&gt;Easier to reuse. Mix and match Ibis snippets to create expressions tailored
for your analysis.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even if you’ve been writing SQL queries since day 1 and swear by it, opening the
door to using Python for analytics creates many new possibilities, widens the
possible talent pool you can work with, and gives you an entire second ecosystem
to integrate with.&lt;/p&gt;

&lt;p&gt;And ultimately, at the end of the day, the idea is that you get the ease of
writing Python code with the power and performance of a blazing fast SQL engine.
&lt;a href=&quot;https://youtu.be/pAWseFS4eAk&quot;&gt;You get the best of both worlds&lt;/a&gt;, and using Ibis
doesn’t lock you out of rolling up your sleeves and writing some SQL when a
situation calls for it.&lt;/p&gt;

&lt;h3 id=&quot;and-you-dont-need-to-learn-different-sql-dialects&quot;&gt;And you don’t need to learn different SQL dialects&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;../assets/episode/49/standards_2x.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Trino more or less adheres to ANSI SQL, but it implements some ANSI features
that are rarely seen in other query engines, and other query engines choose to
deviate in a variety of ways. This can be a headache if you’re migrating to
Trino, as queries need to be rewritten, restructured, and tested to make sure
they return the same results. If you set up with Ibis first, it does that
thinking for you: the same Python expression can be compiled to whatever dialect
of SQL you need without any issue. That saves time, effort, and headaches, and
spares you the sense of being locked into a specific SQL dialect, freeing you up
to move between query engines without any pain points… because of course, you
want to move to Trino, which is the best query engine.&lt;/p&gt;

&lt;p&gt;It also needs pointing out that this allows you to federate your queries while
you federate your queries.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-converting-python-to-sql&quot;&gt;Concept of the episode: Converting Python to SQL&lt;/h2&gt;

&lt;p&gt;Take some Python like so:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ibis&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;movies&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ibis&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;examples&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ml_latest_small_movies&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;fetch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rating_by_year&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;movies&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;group_by&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;year&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;avg_rating&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;q&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rating_by_year&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;order_by&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rating_by_year&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;year&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;desc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And Ibis can automatically turn it into SQL that executes on Trino:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;con&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;compile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;year&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;avg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;avg_rating&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;movies&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t1&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;GROUP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;year&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;ORDER&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;year&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Obviously, this example is lightweight, but as queries grow more complex and
sophisticated, the conversion becomes more and more worthwhile. We mentioned
that the Python code is easier to reuse, and it really is: if you want to run
a similar query in conjunction with the query above, those &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;movies&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rating_by_year&lt;/code&gt; variables still exist, and writing some code to leverage them
is a lot easier and more intuitive than setting up SQL subqueries and aliases.&lt;/p&gt;

&lt;h3 id=&quot;questions-for-phillip&quot;&gt;Questions for Phillip&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Why is it called Ibis?&lt;/li&gt;
  &lt;li&gt;How much of a normal SQL workload do you think could be handled and run by
Ibis?&lt;/li&gt;
  &lt;li&gt;How much can Ibis optimize SQL queries for performance?&lt;/li&gt;
  &lt;li&gt;Which SQL dialect has been the worst to deal with?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;pr-of-the-episode-15026-support-insert-in-google-sheets-connector&quot;&gt;PR of the episode: #15026: Support INSERT in Google Sheets connector&lt;/h2&gt;

&lt;p&gt;Google Sheets is one of our not-as-talked-about connectors in Trino, but it
still sees use and community updates, and we want to give that a shoutout in
today’s Trino Community Broadcast. &lt;a href=&quot;https://github.com/trinodb/trino/pull/15026&quot;&gt;#15026&lt;/a&gt;
from &lt;a href=&quot;https://github.com/sbernauer&quot;&gt;Sebastian Bernauer&lt;/a&gt; adds &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt; support to
the connector, so now you can read &lt;em&gt;and&lt;/em&gt; write from Google Sheets in Trino,
empowering the world of SQL-on-spreadsheets.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-episode-477-on-trinoio-add-mateusz-gajewski-to-maintainer-list&quot;&gt;PR of the episode: #477 on trino.io: Add Mateusz Gajewski to maintainer list&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/trinodb/trino.io/pull/477&quot;&gt;We’ve added another maintainer to Trino!&lt;/a&gt;
We just spent an episode introducing Manfred and James Petty as maintainers, and
Mateusz is right behind them after years of effort helping Trino as a
contributor and reviewer.&lt;/p&gt;

&lt;h2 id=&quot;trino-events&quot;&gt;Trino events&lt;/h2&gt;

&lt;p&gt;Trino Fest wrapped up a few weeks ago, and we’re publishing recaps of all the
talks to the Trino blog! Keep an eye on our YouTube channel and the Trino
website to catch up on everything you missed.&lt;/p&gt;

&lt;p&gt;If you have an event that is related to Trino, let us know so we can add it to
the &lt;a href=&quot;https://trino.io/community.html#events&quot;&gt;Trino events calendar&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>48: What is Trino?</title>
      <link href="https://trino.io/episodes/48.html" rel="alternate" type="text/html" title="48: What is Trino?" />
      <published>2023-05-31T00:00:00+00:00</published>
      <updated>2023-05-31T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/48</id>
      <content type="html" xml:base="https://trino.io/episodes/48.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Manfred Moser, Director of Technical Content at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-417-418&quot;&gt;Releases 417-418&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-417.html&quot;&gt;Trino 417&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Faster &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UNION ALL&lt;/code&gt; queries.&lt;/li&gt;
  &lt;li&gt;Faster processing of Parquet data in Hudi, Iceberg, Hive, and Delta Lake
connectors.&lt;/li&gt;
  &lt;li&gt;Faster reads of nested row fields in Delta Lake connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-418.html&quot;&gt;Trino 418&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXECUTE IMMEDIATE&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Add the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;table_changes&lt;/code&gt; function in Delta Lake connector.&lt;/li&gt;
  &lt;li&gt;Faster joins on partition columns in Delta Lake, Hive, Hudi, and Iceberg
connectors.&lt;/li&gt;
  &lt;li&gt;Support for fault-tolerant execution in the Oracle connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;question-of-the-episode-what-is-trino&quot;&gt;Question of the episode: What is Trino?&lt;/h2&gt;

&lt;p&gt;We’ve put out nearly 50 Trino Community Broadcast episodes, but we haven’t yet
done the simplest, most obvious topic of them all - an exploration of what Trino
is, how Trino works, and how you can run it. This week, we’re taking a step back
and doing a broader overview of those things, because the world needs to know…
what is Trino?&lt;/p&gt;

&lt;p&gt;If you check the Trino documentation, it starts with a definition of what Trino
isn’t. But we’ll start with what Trino is: a distributed SQL query engine
written in Java. If you have a SQL query, Trino can process and run it on an
extremely wide variety of data sources and return a result to you that you’d
expect from that SQL query. It can run queries on traditional relational
databases like Oracle, MySQL, and PostgreSQL; it works on data lakes like Hive,
Iceberg, Delta Lake, and Hudi; and it runs on NoSQL databases like Cassandra
and MongoDB. You give Trino a query, Trino gives you results. And the best part
is that it doesn’t just work, it works blazing fast.&lt;/p&gt;

&lt;p&gt;The key thing to point out is that Trino does not store data, and it is not a
database on its own. It is a query engine, designed to sit on top of databases
and provide an ANSI-standard SQL interface to query whatever you’re storing your
data in. In order to use Trino, you need to start by having data stored
somewhere else. Of course, Trino can write data to those underlying
sources with the same SQL syntax, so for the end user, it can be an all-in-one
interface to those underlying data sources, an abstraction that saves users from
needing to understand the differences between data being stored in Iceberg and
data being stored in Oracle.&lt;/p&gt;

&lt;h3 id=&quot;how-does-it-work&quot;&gt;How does it work?&lt;/h3&gt;

&lt;p&gt;Trino uses a distributed architecture, with a single coordinator node that
schedules and orchestrates the workload, as well as many worker nodes that
carry out tasks and process data.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-how-do-you-run-trino&quot;&gt;Concept of the episode: How do you run Trino?&lt;/h2&gt;

&lt;p&gt;The better question might be “how can’t you run Trino?” As the project has
matured, it’s been added to various third-party tools and integrated into
different apps that help make it easier to run than ever before. We have some
exciting news to share on that front soon, but for now, the biggest ways to run
Trino include:&lt;/p&gt;

&lt;h3 id=&quot;tarball&quot;&gt;Tarball&lt;/h3&gt;

&lt;p&gt;You can directly download the Trino server, manually configure it, and start it
up like any other program. Clients can connect to the server from there,
utilizing the web interface or the CLI to run queries. This is the most manual
way to set up Trino, but it works, and it doesn’t depend on anything else.
&lt;a href=&quot;https://trino.io/docs/current/installation/deployment.html&quot;&gt;Our docs go into a ton of detail on this process.&lt;/a&gt;&lt;/p&gt;
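&lt;p&gt;As a rough sketch of that manual process (the version number and the minimal single-node config values here are illustrative, not authoritative; the deployment docs are the source of truth):&lt;/p&gt;

```shell
# Download and unpack the Trino server tarball (version 418 is illustrative)
wget https://repo1.maven.org/maven2/io/trino/trino-server/418/trino-server-418.tar.gz
tar -xzf trino-server-418.tar.gz
cd trino-server-418

# Minimal single-node configuration, per the deployment docs
mkdir -p etc
printf 'node.environment=demo\n' etc/node.properties > etc/node.properties
printf -- '-Xmx2G\n' > etc/jvm.config
printf '%s\n' \
  'coordinator=true' \
  'node-scheduler.include-coordinator=true' \
  'http-server.http.port=8080' \
  'discovery.uri=http://localhost:8080' > etc/config.properties

# Start the server in the foreground
bin/launcher run
```

&lt;p&gt;Once the log reports the server has started, clients can connect on port 8080.&lt;/p&gt;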

&lt;h3 id=&quot;docker&quot;&gt;Docker&lt;/h3&gt;

&lt;p&gt;Trino provides a Docker image that can be run through the Docker software. You
start by downloading and installing Docker, pull the Trino image, and then run
a container from it to immediately get Trino up
and running. No manual configuration needed, no messing around with creating
directories or files, it just works. It’s perhaps the simplest way to get Trino
off the ground, and recommended for anyone trying to run it independently just
to fiddle around with it.
&lt;a href=&quot;https://trino.io/docs/current/installation/containers.html&quot;&gt;As always, you can refer to the docs for more information.&lt;/a&gt;&lt;/p&gt;
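&lt;p&gt;A minimal sketch of that flow, assuming a local Docker install (the container name is our choice):&lt;/p&gt;

```shell
# Start Trino in the background, exposing the default HTTP port
docker run -d --name trino -p 8080:8080 trinodb/trino

# Once the log reports SERVER STARTED, open a CLI session inside the container
docker exec -it trino trino
```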

&lt;h3 id=&quot;kubernetes-and-helm&quot;&gt;Kubernetes and Helm&lt;/h3&gt;

&lt;p&gt;Trino provides a Helm chart for use with Kubernetes, so after setting up
Kubernetes, kubectl, and Helm, you can install Trino on your Kubernetes cluster
with Helm. It comes with the same pre-configured image as Docker, so there’s no
need to manually set that up, but in order to run queries, you’ll also need to
set up a tunnel between the coordinator pod within Kubernetes and whatever
machine you want to run those queries on. If this is the right setup for you,
you probably already know that, and you don’t need us to go into more detail.
&lt;a href=&quot;https://trino.io/docs/current/installation/kubernetes.html&quot;&gt;More info is in the Trino docs.&lt;/a&gt;&lt;/p&gt;
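&lt;p&gt;A sketch of that setup, assuming kubectl and Helm already point at a cluster (the release name my-trino is illustrative):&lt;/p&gt;

```shell
# Add the Trino chart repository and install the chart
helm repo add trino https://trinodb.github.io/charts
helm install my-trino trino/trino

# Forward the coordinator port so a local client can reach it
# (the service name may differ; check with: kubectl get svc)
kubectl port-forward svc/my-trino 8080:8080
```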

&lt;h3 id=&quot;trino-clients&quot;&gt;Trino clients&lt;/h3&gt;

&lt;p&gt;On the most basic side of things, Trino provides a command-line interface and a
web UI. If you want something more robust, a couple of open source clients have
come out of the community:
&lt;a href=&quot;https://github.com/trinodb/trino-python-client&quot;&gt;one written in Python&lt;/a&gt; and
&lt;a href=&quot;https://github.com/trinodb/trino-go-client&quot;&gt;one written in Go&lt;/a&gt;. There are a
couple of other Python clients that will be even easier to run coming soon, and
we’ll be hearing from them at Trino Fest in just two weeks.&lt;/p&gt;
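&lt;p&gt;For a taste of the CLI, a one-liner against a local server (the tpch catalog ships in the default Docker configuration; adjust for your own setup):&lt;/p&gt;

```shell
# Run a single query, print the results, and exit
trino --server http://localhost:8080 --catalog tpch --schema tiny \
  --execute 'SELECT name FROM nation LIMIT 3'
```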

&lt;h3 id=&quot;or&quot;&gt;Or…&lt;/h3&gt;

&lt;p&gt;On the not-so-free side of things, Starburst Galaxy and AWS Athena offer Trino
as a cloud service, which can make life even easier.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-how-can-you-contribute-to-trino&quot;&gt;Concept of the episode: How can you contribute to Trino?&lt;/h2&gt;

&lt;p&gt;We’ve got a page on the website dedicated to
&lt;a href=&quot;https://trino.io/development/process.html&quot;&gt;the contribution process&lt;/a&gt;, though we’d
like to welcome anyone and everyone listening to take a crack at contributing to
Trino if it’s something you’re interested in. Open source projects can always
use more help, and we’d love to see community contributions whenever they come.
From that process page, the steps are:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Sign the CLA.&lt;/li&gt;
  &lt;li&gt;Make sure your contribution is something that Trino wants/needs.&lt;/li&gt;
  &lt;li&gt;Implement your change.&lt;/li&gt;
  &lt;li&gt;Open a pull request.&lt;/li&gt;
  &lt;li&gt;Request and wait for a review.&lt;/li&gt;
  &lt;li&gt;Address review comments.&lt;/li&gt;
  &lt;li&gt;Wait for it to be merged.&lt;/li&gt;
  &lt;li&gt;Wait for the next release, and then… your code change is in Trino!&lt;/li&gt;
&lt;/ol&gt;
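&lt;p&gt;For step 3, the local development loop is plain Git and Maven; a sketch, with your-user standing in for your GitHub fork and core/trino-main for whichever module you touch:&lt;/p&gt;

```shell
# Clone your fork of Trino
git clone https://github.com/your-user/trino.git
cd trino

# Build just the module you changed, plus its dependencies
./mvnw clean install -DskipTests -pl core/trino-main -am

# Run that module's tests before opening the pull request
./mvnw test -pl core/trino-main
```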

&lt;h2 id=&quot;pr-of-the-episode-11701-support-nessie-catalog-in-iceberg-connector&quot;&gt;PR of the episode: #11701: Support Nessie Catalog in Iceberg connector&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://projectnessie.org/&quot;&gt;Nessie&lt;/a&gt; is a transactional catalog designed for use
with data lakes like Iceberg and Delta Lake. Its key selling point is git-like
version control, making it easy to view history, roll back, and see who made
what adjustments when. &lt;a href=&quot;https://github.com/trinodb/trino/pull/11701&quot;&gt;PR #11701&lt;/a&gt;
allows Trino’s Iceberg connector to use Nessie as its catalog, adding yet
another tool to Trino’s belt and another opportunity for query federation.&lt;/p&gt;

&lt;p&gt;And though we hate to say it, Nessie might just be the only other project in the
world with a mascot that can compete with Commander Bun Bun.&lt;/p&gt;

&lt;h2 id=&quot;trino-events&quot;&gt;Trino events&lt;/h2&gt;

&lt;p&gt;Coming up in just two weeks, Trino Fest is a two-day event that will feature
talks from a wide range of speakers across the Trino ecosystem. As already
hinted at, we’ll be hearing from a couple new Python clients, from Trino users
sharing tips and tricks to maximize the utility of the software, and from
community contributors adding exciting new features and extensions to Trino.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.starburst.io/info/trinofest/&quot;&gt;Register to attend&lt;/a&gt; if you’re
interested and want to tune in to an awesome speaker lineup! It’s virtual and
completely free to attend, so all you’ve got to do is sign up.&lt;/p&gt;

&lt;p&gt;If you have an event that is related to Trino, let us know so we can add it to
the &lt;a href=&quot;https://trino.io/community.html#events&quot;&gt;Trino events calendar&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>47: Meet the new Trino maintainers</title>
      <link href="https://trino.io/episodes/47.html" rel="alternate" type="text/html" title="47: Meet the new Trino maintainers" />
      <published>2023-05-05T00:00:00+00:00</published>
      <updated>2023-05-05T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/47</id>
      <content type="html" xml:base="https://trino.io/episodes/47.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Manfred Moser, Director of Technical Content at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/pettyjamesm&quot;&gt;James Petty&lt;/a&gt;, Senior Software Engineer at AWS&lt;/li&gt;
  &lt;li&gt;Also Manfred. Kind of.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-411-416&quot;&gt;Releases 411-416&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-411.html&quot;&gt;Trino 411&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;migrate&lt;/code&gt; procedure to convert a Hive table to Iceberg.&lt;/li&gt;
  &lt;li&gt;Join and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LIKE&lt;/code&gt; pushdown in Ignite.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DELETE&lt;/code&gt; in Ignite.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;procedure&lt;/code&gt; table function for executing stored procedures in SQL Server.&lt;/li&gt;
  &lt;li&gt;Faster join queries over Hive bucketed tables.&lt;/li&gt;
  &lt;li&gt;Faster planning for tables with many columns in Hive.&lt;/li&gt;
&lt;/ul&gt;
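&lt;p&gt;As a rough sketch of the new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;procedure&lt;/code&gt; table function, assuming a SQL Server catalog named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;example&lt;/code&gt; and a hypothetical stored procedure &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dbo.employee_sp&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- pass a stored procedure call through to SQL Server
SELECT *
FROM TABLE(
    example.system.procedure(
        query =&gt; 'EXECUTE dbo.employee_sp'));
&lt;/code&gt;&lt;/pre&gt;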

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-412.html&quot;&gt;Trino 412&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;exclude_columns&lt;/code&gt; table function.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ADD COLUMN&lt;/code&gt; in Ignite.&lt;/li&gt;
  &lt;li&gt;Support for table comments in PostgreSQL connector.&lt;/li&gt;
  &lt;li&gt;Faster &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sum(DISTINCT ...)&lt;/code&gt; queries for various connectors.&lt;/li&gt;
&lt;/ul&gt;
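&lt;p&gt;As a quick sketch, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;exclude_columns&lt;/code&gt; returns its input table minus the listed columns - the table and column names here are just illustrative:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- return all columns of orders except clerk and comment
SELECT *
FROM TABLE(exclude_columns(
    input =&gt; TABLE(orders),
    columns =&gt; DESCRIPTOR(clerk, comment)));
&lt;/code&gt;&lt;/pre&gt;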

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-413.html&quot;&gt;Trino 413&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MERGE&lt;/code&gt; in the Phoenix connector.&lt;/li&gt;
  &lt;li&gt;Support for table comments in the Oracle connector.&lt;/li&gt;
  &lt;li&gt;Improved performance of queries involving window functions or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MATCH_RECOGNIZE&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-414.html&quot;&gt;Trino 414&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Experimental support for tracing using OpenTelemetry.&lt;/li&gt;
  &lt;li&gt;Support for Databricks 12.2 LTS in Delta Lake connector.&lt;/li&gt;
  &lt;li&gt;Support for fault-tolerant execution in Redshift connector.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sequence&lt;/code&gt; table function.&lt;/li&gt;
&lt;/ul&gt;
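&lt;p&gt;For example, the new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sequence&lt;/code&gt; table function produces a single column of sequential numbers - a minimal sketch:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- generate 1, 4, 7, 10
SELECT *
FROM TABLE(sequence(
    start =&gt; 1,
    stop =&gt; 10,
    step =&gt; 3));
&lt;/code&gt;&lt;/pre&gt;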

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-415.html&quot;&gt;Trino 415&lt;/a&gt; and
&lt;a href=&quot;https://trino.io/docs/current/release/release-416.html&quot;&gt;Trino 416&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A whole lot of minor performance improvements.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;introducing-the-two-new-trino-maintainers&quot;&gt;Introducing the two new Trino maintainers&lt;/h2&gt;

&lt;p&gt;Manfred should hardly need an introduction to Trino Community Broadcast viewers,
as he’s been around and hosting episodes from the beginning, and authored
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;Trino: The Definitive Guide&lt;/a&gt;.
In the background, he’s also been quietly working on docs, the website, and
a wide variety of other initiatives in the Trino community.&lt;/p&gt;

&lt;p&gt;James should also be familiar to anyone who has contributed to Trino.
Iconically rocking a GitHub avatar of the face of
&lt;a href=&quot;https://en.wikipedia.org/wiki/Bob_Ross&quot;&gt;Bob Ross&lt;/a&gt;, he’s hard to miss
when he shows up on a pull request. Working on Trino as part of
&lt;a href=&quot;https://aws.amazon.com/athena/&quot;&gt;AWS Athena&lt;/a&gt;, he’s been a major engineering
contributor for the last several years, with 262 commits under his belt and more
on the way.&lt;/p&gt;

&lt;h2 id=&quot;what-is-a-maintainer&quot;&gt;What is a maintainer?&lt;/h2&gt;

&lt;p&gt;If you don’t go clicking around on the Trino website fanatically trying to find
everything you can possibly read about the project, there’s a chance you’ve
never bumped into our &lt;a href=&quot;https://trino.io/development/roles.html&quot;&gt;roles&lt;/a&gt; page,
which highlights how Trino is governed. To quote that page:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;In Trino, maintainer is an active role. A maintainer is responsible for
merging code only after ensuring it has been reviewed thoroughly and aligns with
the Trino vision and guidelines. In addition to merging code, a maintainer
actively participates in discussions and reviews. Being a maintainer does not
grant additional rights in the project to make changes, set direction, or
anything else that does not align with the direction of the project. Instead, a
maintainer is expected to bring these to the project participants as needed to
gain consensus. The maintainer role is for an individual, so if a maintainer
changes employers, the role is retained. However, if a maintainer is no longer
actively involved in the project, their maintainer status will be reviewed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Or, in normal speech, a maintainer is a trusted individual with merge rights.
But with great power comes great responsibility, higher standards, and an
expectation to be an active steward of the Trino project. It’s not easy to
become a maintainer - prior to Manfred and James, it had been over a year since
the most recent maintainer was appointed. The high bar of activity, quality, and
attitude is not trivial by any stretch, and so we’re excited to talk to them
about the role, how they got here, and what they’re looking forward to for the
future of Trino.&lt;/p&gt;

&lt;h2 id=&quot;the-path-to-becoming-a-maintainer&quot;&gt;The path to becoming a maintainer&lt;/h2&gt;

&lt;h3 id=&quot;manfred&quot;&gt;Manfred&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;When did you first start working on Trino?&lt;/li&gt;
  &lt;li&gt;What’s your proudest contribution to the project?&lt;/li&gt;
  &lt;li&gt;Have a funny story you’ve wanted to share with the world?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;james&quot;&gt;James&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;When did you first start working on Trino?&lt;/li&gt;
  &lt;li&gt;What’s your proudest contribution to the project?&lt;/li&gt;
  &lt;li&gt;Why the Bob Ross avatar?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;pr-of-the-episode-16753-improve-topn-row-number--rank-performance&quot;&gt;PR of the episode: &lt;a href=&quot;https://github.com/trinodb/trino/pull/16753&quot;&gt;16753: Improve TopN row number / rank performance&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;We normally focus on flashy and user-facing PRs for the PR of the episode, but
this week, courtesy of our guest James, we’re going to highlight something that
better represents the more routine work that’s going on in Trino all the time:
a performance improvement.&lt;/p&gt;

&lt;h2 id=&quot;trino-events&quot;&gt;Trino events&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.starburst.io/info/trinofest/&quot;&gt;Trino Fest&lt;/a&gt; is coming up in just a
couple months. Register to attend or
&lt;a href=&quot;https://sessionize.com/trino-fest-2023&quot;&gt;sign up to submit a talk&lt;/a&gt; if you have
something to share!&lt;/p&gt;

&lt;p&gt;If you have an event that is related to Trino, let us know so we can add it to
the &lt;a href=&quot;https://trino.io/community.html#events&quot;&gt;Trino events calendar&lt;/a&gt;. Kevin Haley’s
&lt;a href=&quot;https://www.meetup.com/boston-data-engineering/events/291662797/&quot;&gt;Getting to Know Trino&lt;/a&gt;
in Boston was a great success, and we’d love to hear from other Trino community 
members who’d be interested in hosting other events!&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>46: Trino heats up with Ignite</title>
      <link href="https://trino.io/episodes/46.html" rel="alternate" type="text/html" title="46: Trino heats up with Ignite" />
      <published>2023-03-15T00:00:00+00:00</published>
      <updated>2023-03-15T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/46</id>
      <content type="html" xml:base="https://trino.io/episodes/46.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Manfred Moser, Director of Information Engineering at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/jian-chen-7aa3a2225/&quot;&gt;Jason&lt;/a&gt;, Senior Data
Engineer at Shopee.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-408-410&quot;&gt;Releases 408-410&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-408.html&quot;&gt;Trino 408&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New Apache Ignite connector!&lt;/li&gt;
  &lt;li&gt;Add support for writing decimal types to BigQuery.&lt;/li&gt;
  &lt;li&gt;Improve performance when reading structural types from Parquet files in Delta Lake.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-409.html&quot;&gt;Trino 409&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for nested fields in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DROP COLUMN&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Support for sorted tables in Iceberg.&lt;/li&gt;
  &lt;li&gt;Support for time type in Cassandra.&lt;/li&gt;
  &lt;li&gt;Faster aggregations containing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DISTINCT&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Faster &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LIKE&lt;/code&gt; with dynamic patterns.&lt;/li&gt;
&lt;/ul&gt;
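&lt;p&gt;For the sorted Iceberg tables, the sort order is declared with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sorted_by&lt;/code&gt; table property at creation time - a sketch, with catalog, schema, and column names as placeholders:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- write data files sorted by order_date
CREATE TABLE example.sales.orders (
    order_id BIGINT,
    order_date DATE
)
WITH (sorted_by = ARRAY['order_date']);
&lt;/code&gt;&lt;/pre&gt;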

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-410.html&quot;&gt;Trino 410&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sheet&lt;/code&gt; table function in Google Sheets.&lt;/li&gt;
  &lt;li&gt;Better file pruning in Iceberg.&lt;/li&gt;
&lt;/ul&gt;
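&lt;p&gt;A sketch of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sheet&lt;/code&gt; table function, assuming a Google Sheets catalog named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;example&lt;/code&gt; and a placeholder spreadsheet ID:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- read an arbitrary sheet by its spreadsheet ID
SELECT *
FROM TABLE(example.system.sheet(
    id =&gt; 'googleSheetIdHere'));
&lt;/code&gt;&lt;/pre&gt;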

&lt;h2 id=&quot;introducing-the-ignite-connector-to-trino&quot;&gt;Introducing the Ignite connector to Trino&lt;/h2&gt;

&lt;p&gt;The &lt;a href=&quot;https://trino.io/docs/current/connector/ignite.html&quot;&gt;Trino Ignite connector&lt;/a&gt;
was added a couple of releases ago in Trino 408. It’s not every day that we add
a new connector to Trino, so the topic of today’s episode is exploring the
connector, what it does, and what its use cases are. After that, we talk
about the process of coming in as an outside engineer and contributing an
entirely new connector to Trino.&lt;/p&gt;

&lt;h2 id=&quot;what-is-ignite&quot;&gt;What is Ignite?&lt;/h2&gt;

&lt;p&gt;Apache Ignite is an in-memory distributed database, comparable to others you may
be familiar with, like Redis and SingleStore. If you’re not familiar with them or
with in-memory computing, the gist is that by using RAM instead of
disk storage, you can create a database system that is &lt;em&gt;much&lt;/em&gt; faster - the
Ignite website advertises 10-1000x improvements. Of course, this is more
expensive, too, so it thrives in settings where performance is critical.&lt;/p&gt;

&lt;p&gt;With an initial release 7 years ago, Ignite is still a relative newcomer among
in-memory databases, and it comes with modern bells and whistles that position
it as a successor to the other, comparable databases mentioned above. It also
has some key functionality that sets it apart, including a fully-distributed
architecture that can also use disk storage, allowing it to scale
horizontally.&lt;/p&gt;

&lt;h2 id=&quot;contributing-the-ignite-connector&quot;&gt;Contributing the Ignite connector&lt;/h2&gt;

&lt;p&gt;The Trino community and developers try their best to be active reviewers,
collaborators, and participants on pull requests coming in from outside
contributors. Massive contributions like the Ignite connector can take a lot of
round trips, back-and-forth discussion, and work from both the contributor and
the project’s maintainers to get it into a state where it is ready to merge and
go live for users to try out.&lt;/p&gt;

&lt;p&gt;To give you an idea,
&lt;a href=&quot;https://github.com/trinodb/trino/pull/8323&quot;&gt;the pull request (PR) to contribute Ignite&lt;/a&gt;
was opened in mid-June 2021. It received immediate feedback from a couple of
maintainers and went through a few round trips with amendments, re-reviews, more
edits, and then further reviews. But in an open source environment, each round
trip tends to take longer and longer. Progress stalled in November 2021, and
neither Jason nor the maintainers poked the Ignite PR for nearly a year. In
October 2022, as part of Trino DevRel’s roundup of stale and out-of-date pull
requests, we bumped back into the work that Jason had done. The wheels began to
turn again, starting slowly but picking up the pace, until the PR returned to
full and active development, with several maintainers checking in frequently
until the connector was ready to go. But that’s the story from an observer, and
we’ve got Jason here to go into more detail.&lt;/p&gt;

&lt;h3 id=&quot;questions-for-jason&quot;&gt;Questions for Jason&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;How was the Trino review process?&lt;/li&gt;
  &lt;li&gt;Were there any major lessons you picked up along the way?&lt;/li&gt;
  &lt;li&gt;What tips would you give to someone else looking to add something into Trino?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;pr-of-the-episode-13493-add-support-for-migrate-procedure-in-iceberg&quot;&gt;PR of the episode: #13493: Add support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;migrate&lt;/code&gt; procedure in Iceberg&lt;/h2&gt;

&lt;p&gt;If you’ve been in the data space for a while, you may know that there’s a bit of
a prevailing current in migrating from Hive to Iceberg. Out with the old, in
with the new, and in with the performance gains. &lt;a href=&quot;https://github.com/ebyhr&quot;&gt;Yuya Ebihara&lt;/a&gt;,
one of the Trino maintainers,
&lt;a href=&quot;https://github.com/trinodb/trino/pull/13493&quot;&gt;has added a table procedure to Trino’s Iceberg connector&lt;/a&gt;
to make that process much, much simpler. Rather than a slow, manual, and arduous
process, if you have a Hive table stored in a file format supported by Iceberg,
it’s now as simple as calling the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;migrate&lt;/code&gt; table procedure and letting it run.
The procedure copies the schema, partitioning, properties, and location of the
source table, then registers the source table’s existing data files in the new
Iceberg metadata, converting it all to the Iceberg format without rewriting the
data. Neat, right?&lt;/p&gt;
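&lt;p&gt;A minimal sketch of calling the procedure, assuming an Iceberg catalog named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;iceberg&lt;/code&gt; and placeholder schema and table names:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- convert a Hive table to Iceberg in place
CALL iceberg.system.migrate(
    schema_name =&gt; 'example_schema',
    table_name =&gt; 'orders');
&lt;/code&gt;&lt;/pre&gt;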

&lt;h2 id=&quot;more-about-ignite&quot;&gt;More about Ignite&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://ignite.apache.org/&quot;&gt;Check out the Ignite website!&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://twitter.com/ApacheIgnite&quot;&gt;Ignite on Twitter&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/showcase/apache-ignite/&quot;&gt;Ignite on LinkedIn&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;trino-events&quot;&gt;Trino events&lt;/h2&gt;

&lt;p&gt;If you have an event that is related to Trino, let us know so we can add it to
the &lt;a href=&quot;https://trino.io/community.html#events&quot;&gt;Trino events calendar&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Kevin Haley will be hosting an in-person event,
&lt;a href=&quot;https://www.meetup.com/boston-data-engineering/events/291662797/&quot;&gt;Getting to Know Trino&lt;/a&gt;,
in Boston, Massachusetts on Wednesday, April 5. You need to register in advance,
so if you’re in the Boston area and interested in attending, go sign up!&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Check out the in-person and virtual
&lt;a href=&quot;https://www.meetup.com/pro/trino-community/&quot;&gt;Trino Meetup groups&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>45: Trino swimming with the DolphinScheduler</title>
      <link href="https://trino.io/episodes/45.html" rel="alternate" type="text/html" title="45: Trino swimming with the DolphinScheduler" />
      <published>2023-02-23T00:00:00+00:00</published>
      <updated>2023-02-23T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/45</id>
      <content type="html" xml:base="https://trino.io/episodes/45.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Brian Olsen, Developer Advocate at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/bitsondatadev&quot;&gt;@bitsondatadev&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate at
  &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/davidzollo/&quot;&gt;David Zollo&lt;/a&gt;, Apache
DolphinScheduler PMC Chair&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/zhongjiajie/&quot;&gt;Jay Chung&lt;/a&gt;,  Apache
DolphinScheduler PMC Member&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/niko-zeng/&quot;&gt;Niko Zeng&lt;/a&gt;,  Apache
DolphinScheduler Community Manager&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/williamk2000/&quot;&gt;William Guo&lt;/a&gt;, Apache Software 
Foundation Member&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;recap-of-trino-in-2022&quot;&gt;Recap of Trino in 2022&lt;/h2&gt;

&lt;p&gt;Highlights from the blog post &lt;a href=&quot;/blog/2023/01/10/trino-2022-the-rabbit-reflects.html&quot;&gt;The rabbit reflects on Trino in 2022&lt;/a&gt; touch upon various aspects.&lt;/p&gt;

&lt;h2 id=&quot;release-407&quot;&gt;Release 407&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-407.html&quot;&gt;Trino 407&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improved performance for highly selective queries.&lt;/li&gt;
  &lt;li&gt;Improved performance when reading numeric, string and timestamp
values from Parquet files.&lt;/li&gt;
  &lt;li&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query&lt;/code&gt; table function for full query pass-through in Cassandra.&lt;/li&gt;
  &lt;li&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unregister_table&lt;/code&gt; procedure in Delta Lake and Iceberg.&lt;/li&gt;
  &lt;li&gt;Support for writing to the change data feed in Delta Lake.&lt;/li&gt;
&lt;/ul&gt;
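&lt;p&gt;As a sketch of two of those features, assuming a Cassandra catalog named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;example&lt;/code&gt; and an Iceberg catalog named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;iceberg&lt;/code&gt;, with placeholder keyspace, schema, and table names:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- pass a query through unchanged to Cassandra
SELECT *
FROM TABLE(example.system.query(
    query =&gt; 'SELECT * FROM my_keyspace.my_table'));

-- drop table metadata without deleting the underlying data
CALL iceberg.system.unregister_table(
    schema_name =&gt; 'example_schema',
    table_name =&gt; 'orders');
&lt;/code&gt;&lt;/pre&gt;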

&lt;p&gt;Cole’s comments:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;For our contributors, we added a new action to track and ping the developer
relations team on stale pull requests to further prompt maintainers to take a
look. This doesn’t have any immediate impact on end users, but it’ll improve
the development and contribution process.&lt;/li&gt;
  &lt;li&gt;A Kerberos fix for the Kudu connector should make using it much
less of a headache on long-running Trino instances.&lt;/li&gt;
  &lt;li&gt;There were some really sophisticated performance improvements
that came from shifting default config values and adding some new
ones, all of which took a whole lot of testing.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the release notes for
&lt;a href=&quot;https://trino.io/docs/current/release/release-407.html&quot;&gt;Trino 407&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;what-is-workflow-orchestration&quot;&gt;What is workflow orchestration?&lt;/h2&gt;

&lt;p&gt;Workflow orchestration refers to the process of coordinating and automating
complex sequences of operations, known as workflows, that consist of multiple
interdependent tasks. This involves designing and defining the workflow,
scheduling and executing the tasks, monitoring the progress and outcomes, and
handling any errors or exceptions that may arise. In the context of Trino, the
tasks are typically the processing of SQL queries on one or more Trino clusters
and other related systems to create a data pipeline or similar automation.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/episode/45/data-pipelines.png&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;why-do-we-need-a-workflow-orchestration-tool-for-building-a-data-lake&quot;&gt;Why do we need a workflow orchestration tool for building a data lake?&lt;/h2&gt;

&lt;p&gt;Building a data lake can involve many complex and interdependent data processing
tasks, which can be challenging to manage and scale without a workflow
orchestration tool. Sometimes we consider tools like Trino the center of
the universe, and perhaps it would be easier to schedule SQL queries with a much
simpler tool. Most companies, however, require a larger variety of tasks to
build a data lake, involving more than just running SQL on Trino. Even
if you primarily run Trino SQL scripts for these jobs, it is better to have
an orchestration tool than to manage all processes manually.&lt;/p&gt;

&lt;h2 id=&quot;what-is-apache-dolphinscheduler&quot;&gt;What is Apache DolphinScheduler?&lt;/h2&gt;

&lt;p&gt;&lt;img width=&quot;75%&quot; src=&quot;/assets/episode/45/dolphin-scheduler.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Apache DolphinScheduler is an open-source, distributed workflow scheduling
platform designed to manage and execute batch jobs, data pipelines, and ETL
processes. DolphinScheduler enables users to easily create and manage runs of
consecutive jobs, with support for different types of tasks, such as SQL
statements, shell scripts, Spark jobs, Kubernetes deployments, and many others.
In short, it’s a powerful and user-friendly workflow orchestration platform that
enables users to automate and manage their complex data processing tasks.&lt;/p&gt;

&lt;p&gt;Read &lt;a href=&quot;https://blog.devgenius.io/dolphinscheduler-helps-trino-quickly-realize-the-integrated-data-construction-of-lake-and-warehouse-cde095b6573b&quot;&gt;this blog on Trino and Apache DolphinScheduler&lt;/a&gt;
to find out more.&lt;/p&gt;

&lt;h3 id=&quot;does-dolphinscheduler-have-any-computing-engine-or-storage-layer&quot;&gt;Does DolphinScheduler have any computing engine or storage layer?&lt;/h3&gt;

&lt;p&gt;DolphinScheduler is a powerful tool for managing and orchestrating data
processing workflows across a range of computing engines and storage systems,
but it does not provide its own computing or storage capabilities.&lt;/p&gt;

&lt;h2 id=&quot;what-are-the-differences-to-other-workflow-orchestration-systems&quot;&gt;What are the differences to other workflow orchestration systems?&lt;/h2&gt;

&lt;p&gt;Airflow is the incumbent de facto workflow orchestrator. Many data engineers
currently rely on Airflow to handle their workflow orchestration, so it
helps to understand DolphinScheduler’s benefits in relation to Airflow. Both
DolphinScheduler and Airflow are designed to be scalable and highly available
to support large-scale distributed environments.&lt;/p&gt;

&lt;p&gt;Airflow supports a wide range of third-party integrations, including popular
data processing frameworks such as Trino, Spark, and Flink, as well as
cloud services such as AWS and Google Cloud. DolphinScheduler supports a
similar range of data processing frameworks and tools. This makes both platforms
suitable for managing diverse data processing tasks.&lt;/p&gt;

&lt;p&gt;The DolphinScheduler project believes that future data governance belongs to
data engineers and consumers alike and should not be centralized in a single
team. Product-focused engineering teams should have access to data and be able
to orchestrate workflows without the need for extensive coding skills.
DolphinScheduler uses a drag-and-drop web UI to create and manage workflows,
while also providing programmatic access through tools like a Python SDK and an
open API.&lt;/p&gt;

&lt;p&gt;Because DolphinScheduler supports users outside the data team through its UI,
it also offers robust security features, including authentication,
authorization, and data encryption, to ensure that users’ data and workflows
are protected.&lt;/p&gt;

&lt;p&gt;DolphinScheduler has relatively limited documentation and community support
since it is a newer project, but the community is working hard to improve the
developer experience and documentation.&lt;/p&gt;

&lt;h2 id=&quot;how-does-dolphinscheduler-deal-with-failures&quot;&gt;How does DolphinScheduler deal with failures?&lt;/h2&gt;

&lt;p&gt;Failure is an inevitable aspect of data workflow orchestration. The merits of
many of these orchestration tools come from how well they aid users in
responding to failures by monitoring health and notifying users when things go
wrong.&lt;/p&gt;

&lt;h3 id=&quot;does-dolphinscheduler-have-an-alarm-mechanism-itself&quot;&gt;Does DolphinScheduler have an alarm mechanism itself?&lt;/h3&gt;

&lt;p&gt;Apache DolphinScheduler supports user notifications as part of a workflow. This
mechanism is designed to help users monitor and manage their workflows more
effectively and respond quickly to any issues.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/episode/45/alerts.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;These alerts can be configured to notify users via email, SMS, or other
communication channels, and can include details such as the name of the
workflow, the name of the failed task, and the error message or stack trace
associated with the failure.&lt;/p&gt;

&lt;p&gt;In addition to these configurable alerts, DolphinScheduler provides a dashboard
for monitoring the status and progress of workflows and tasks. It includes
real-time updates and visualizations of workflow performance and status. The
dashboard helps users quickly identify any issues or bottlenecks in their
workflows and take corrective action as needed.&lt;/p&gt;

&lt;p&gt;&lt;img width=&quot;80%&quot; src=&quot;/assets/episode/45/monitoring.png&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-episode-creating-a-simple-trino-workflow-in-dolphinscheduler&quot;&gt;Demo of the episode: Creating a simple Trino workflow in DolphinScheduler&lt;/h2&gt;

&lt;p&gt;For this episode’s demo, we look at creating a workflow consisting of a Trino
query execution managed by DolphinScheduler.&lt;/p&gt;

&lt;p&gt;Run the demo by following 
&lt;a href=&quot;https://github.com/bitsondatadev/trino-getting-started/tree/main/community_tutorials/dolphinscheduler&quot;&gt;the steps listed&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-episode-improve-performance-of-parquet-files&quot;&gt;PR of the episode: Improve performance of Parquet files&lt;/h2&gt;

&lt;p&gt;While we’re on the topic of data lakes, release 407 included several Parquet
performance improvements from contributor and maintainer
&lt;a href=&quot;https://github.com/raunaqmorarka&quot;&gt;@raunaqmorarka&lt;/a&gt;. These changes improve
the performance of reading Parquet files for
&lt;a href=&quot;https://github.com/trinodb/trino/issues/15713&quot;&gt;decimal types&lt;/a&gt;,
&lt;a href=&quot;https://github.com/trinodb/trino/issues/15850&quot;&gt;numeric types&lt;/a&gt;,
&lt;a href=&quot;https://github.com/trinodb/trino/issues/15923&quot;&gt;string types&lt;/a&gt;, and
&lt;a href=&quot;https://github.com/trinodb/trino/issues/15954&quot;&gt;timestamp and boolean types&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;While Trino has historically had better performance with the ORC format, the
Parquet file format has grown drastically in popularity, and this is one of
many examples of the improving support for Parquet files in data lakes.&lt;/p&gt;

&lt;h2 id=&quot;find-out-more-about-dolphinscheduler&quot;&gt;Find out more about DolphinScheduler&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://dolphinscheduler.apache.org/&quot;&gt;https://dolphinscheduler.apache.org/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/apache/dolphinscheduler&quot;&gt;https://github.com/apache/dolphinscheduler&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://twitter.com/dolphinschedule&quot;&gt;https://twitter.com/dolphinschedule&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;trino-events&quot;&gt;Trino events&lt;/h2&gt;

&lt;p&gt;If you have an event that is related to Trino, let us know so we can add it to
the &lt;a href=&quot;https://trino.io/community.html#events&quot;&gt;Trino events calendar&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Check out the in-person and virtual
&lt;a href=&quot;https://www.meetup.com/pro/trino-community/&quot;&gt;Trino Meetup groups&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>44: Seeing clearly with Metabase</title>
      <link href="https://trino.io/episodes/44.html" rel="alternate" type="text/html" title="44: Seeing clearly with Metabase" />
      <published>2023-01-26T00:00:00+00:00</published>
      <updated>2023-01-26T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/44</id>
      <content type="html" xml:base="https://trino.io/episodes/44.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Manfred Moser, Director of Information Engineering at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/luispaolini/&quot;&gt;Luis Paolini&lt;/a&gt;, Success Engineer at
&lt;a href=&quot;https://www.metabase.com/&quot;&gt;Metabase&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/andrewdibiasio/&quot;&gt;Andrew DiBiasio&lt;/a&gt;, Software
Engineer at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/piotrleniartek&quot;&gt;Piotr Leniartek&lt;/a&gt;, Product Manager
at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;recap-of-trino-in-2022&quot;&gt;Recap of Trino in 2022&lt;/h2&gt;

&lt;p&gt;The blog post &lt;a href=&quot;/blog/2023/01/10/trino-2022-the-rabbit-reflects.html&quot;&gt;The rabbit reflects on Trino in 2022&lt;/a&gt; covers many highlights:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Lots of growth for the community, celebrating 10 years of Trino&lt;/li&gt;
  &lt;li&gt;Trino Summit, Cinco de Trino, Trino Community Broadcast, and more content&lt;/li&gt;
  &lt;li&gt;Trino: The Definitive Guide second edition&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Lots of Trino releases and new features:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MERGE&lt;/code&gt; support&lt;/li&gt;
  &lt;li&gt;JSON functions&lt;/li&gt;
  &lt;li&gt;Table functions&lt;/li&gt;
  &lt;li&gt;Fault-tolerant execution&lt;/li&gt;
  &lt;li&gt;Upgrade to Java 17&lt;/li&gt;
  &lt;li&gt;New Delta Lake, Hudi, and MariaDB connectors&lt;/li&gt;
  &lt;li&gt;Tons and tons of performance improvements&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-404-to-406&quot;&gt;Releases 404 to 406&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-404.html&quot;&gt;Trino 404&lt;/a&gt; not found&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-405.html&quot;&gt;Trino 405&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALTER COLUMN ... SET DATA TYPE&lt;/code&gt; statement.&lt;/li&gt;
  &lt;li&gt;Support for Apache Arrow when reading from BigQuery.&lt;/li&gt;
  &lt;li&gt;Support for views in the Delta Lake connector.&lt;/li&gt;
  &lt;li&gt;Support for the Iceberg REST catalog.&lt;/li&gt;
  &lt;li&gt;Support for Protobuf encoding in the Kafka connector.&lt;/li&gt;
  &lt;li&gt;Support for fault-tolerant execution in the MongoDB connector.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DELETE&lt;/code&gt; and query pushdown in the Redshift connector.&lt;/li&gt;
  &lt;li&gt;Performance improvements when reading Parquet data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-406.html&quot;&gt;Trino 406&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for JDBC catalog in the Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Support for fault-tolerant execution in the BigQuery connector.&lt;/li&gt;
  &lt;li&gt;Support for exchange spooling on HDFS.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CHECK&lt;/code&gt; constraints with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt; statements.&lt;/li&gt;
  &lt;li&gt;Improved performance for Parquet files with the Delta Lake, Hive, Hudi and
Iceberg connectors.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the release notes for
&lt;a href=&quot;https://trino.io/docs/current/release/release-405.html&quot;&gt;Trino 405&lt;/a&gt;,
and
&lt;a href=&quot;https://trino.io/docs/current/release/release-406.html&quot;&gt;Trino 406&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We also shipped trino-python-client 0.321.0 with the following improvements:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for SQLAlchemy 2.0.&lt;/li&gt;
  &lt;li&gt;Add support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;varbinary&lt;/code&gt; query parameters.&lt;/li&gt;
  &lt;li&gt;Add support for variable precision &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;datetime&lt;/code&gt; types.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;what-is-metabase&quot;&gt;What is Metabase&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;../assets/images/logos/metabase-small.png&quot; align=&quot;right&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.metabase.com/&quot;&gt;Metabase&lt;/a&gt; is the easy, open-source BI tool with the
friendly UX and integrated tooling to let your company explore data on their
own. Everyone in your company can ask questions and learn from your data.&lt;/p&gt;

&lt;p&gt;Running Metabase locally is easy. Try with a container runtime and the 300 MB
image:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker run -it -p 3000:3000 metabase/metabase
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Or use a JVM and the 260 MB single JAR file:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;wget https://downloads.metabase.com/latest/metabase.jar
java -jar metabase.jar
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can go from zero to dashboard in under 6 minutes - &lt;a href=&quot;https://www.metabase.com/demo&quot;&gt;learn more from the
demo&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;../assets/episode/44/metabase-screenshot.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Core features and advantages of Metabase include the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Visual query builder&lt;/li&gt;
  &lt;li&gt;Dashboards&lt;/li&gt;
  &lt;li&gt;Models&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Metabase is a web-based application that you run on a server. You can make it
available to multiple users. It uses SQL to create queries, reports,
visualizations, dashboards, and more.&lt;/p&gt;

&lt;p&gt;You can host it yourself locally, run it in your own datacenter or use the
cloud:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;../assets/episode/44/metabase-self-hosted.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/episode/44/metabase-cloud-hosted.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Metabase is an open source project licensed under the GNU Affero General Public
License (AGPL). It is written in Clojure and therefore runs on the Java
virtual machine.&lt;/p&gt;

&lt;p&gt;Following is a high-level architecture diagram:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;../assets/episode/44/metabase-architecture.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Metabase is also the name of the company, founded in 2014. It provides an
expanded version under a commercial license, a SaaS version of the application,
support and other services, and manages the open source project.&lt;/p&gt;

&lt;p&gt;Metabase runs in more than 50,000 instances around the world, including over
2,000 using the SaaS version.&lt;/p&gt;

&lt;h2 id=&quot;history-of-metabase-and-trino&quot;&gt;History of Metabase and Trino&lt;/h2&gt;

&lt;p&gt;Metabase was first released in 2015 as version 0.9. Since the initial release it
has grown to be a well known and widely used BI application.&lt;/p&gt;

&lt;p&gt;A Presto driver was created in 2018. It directly integrated with the client REST
API. With the rename of Presto to Trino, Manfred &lt;a href=&quot;https://github.com/metabase/metabase/pull/15160&quot;&gt;created a
PR&lt;/a&gt; that replicates this for
Trino to ensure continued support for the community. In the discussion it was
decided that it would be better to use the Trino JDBC driver, similar to how
other drivers for Metabase work.&lt;/p&gt;

&lt;p&gt;After further demand from the user and customer community, Starburst and
Metabase established a collaboration and started implementation of the current
driver. Piotr led the charge, Andrew buckled down and learned Clojure, and
together they created and tested a first release. The driver is now provided as
an open source project managed by Starburst.&lt;/p&gt;

&lt;h2 id=&quot;core-advantages-of-using-metabase-with-trino&quot;&gt;Core advantages of using Metabase with Trino&lt;/h2&gt;

&lt;p&gt;With Metabase and the driver for Trino, Trino users have access to a well
established and proven open source BI tool. It is suitable for internal usage in
any organization, and users can upgrade to the commercial version for more
demanding deployments and use cases.&lt;/p&gt;

&lt;p&gt;The combination of Trino and Metabase also provides a number of unique benefits
for Metabase users that are not available with typical Metabase drivers, which
each connect to a single SQL database and are limited to that specific
database.&lt;/p&gt;

&lt;p&gt;With Trino and the driver, you have access to the following unique features:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Metabase users can connect to databases that do not yet have a Metabase driver,
but are supported by Trino.&lt;/li&gt;
  &lt;li&gt;Trino also enables using SQL for systems that don’t support SQL, such as MongoDB
or Elasticsearch, and therefore allows Metabase usage with these systems.&lt;/li&gt;
  &lt;li&gt;With Trino you can join data from different catalogs in the same SQL query.
This also applies to Metabase reports or visualizations.&lt;/li&gt;
&lt;/ul&gt;
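
&lt;p&gt;As a sketch of the last point, a single federated query can join tables from
two different catalogs. The catalog, schema, and table names here are
hypothetical examples, not a specific setup:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT o.order_id, c.name
FROM postgresql.public.orders o
JOIN mongodb.app.customers c ON o.customer_id = c.id;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Any report or visualization built on such a query in Metabase spans both
systems transparently.&lt;/p&gt;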

&lt;blockquote&gt;
  &lt;p&gt;Can I join multiple engines? Yes &lt;br /&gt;
Can I join SQL and no-SQL engines? YES!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Elasticsearch, Google Spreadsheets, Cassandra, Redis, and others are all
accessible with Trino. Specifically, this also opens up querying object storage
data lakes on S3 and other systems with the Hive, Delta Lake, Iceberg, and Hudi
connectors - all from Metabase.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;../assets/episode/44/metabase-trino-datasources.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Metabase also includes support for access control for any connected datasource,
all the way down to row-level security. This includes Trino, and can be used to
secure Trino access through Metabase for a large group of your Trino users, such
as all BI users. It can even be used to add row-level security for NoSQL
databases.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;../assets/episode/44/metabase-no-sql-security.png&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-episode-metabase-and-trino&quot;&gt;Demo of the episode: Metabase and Trino&lt;/h2&gt;

&lt;p&gt;Luis shows us the demo from his repository at
&lt;a href=&quot;https://github.com/paoliniluis/metabase-trino&quot;&gt;https://github.com/paoliniluis/metabase-trino&lt;/a&gt;.
Watch our video to see it in action, and check out the instructions in the
repository to try it yourself.&lt;/p&gt;

&lt;h2 id=&quot;real-world-use-cases-at-meesho&quot;&gt;Real world use cases at Meesho&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;../assets/images/logos/meesho-small.png&quot; align=&quot;right&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.meesho.com/&quot;&gt;Meesho&lt;/a&gt; is India’s fastest growing internet commerce
company. They provide a large retail website and support small business
entrepreneurs with their platform.&lt;/p&gt;

&lt;p&gt;Meesho relies on Trino, Metabase, and the Trino Metabase driver from
Starburst for their data platform.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;../assets/episode/44/meesho-architecture.png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Piotr and Luis share more details:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Meesho needs the ability to query the lake with high speed, concurrency, and
scale. This was not possible before Trino, in the form of Starburst Enterprise,
and Metabase were introduced.&lt;/li&gt;
  &lt;li&gt;Meesho has observed more than 13 million queries from Metabase in 10 months.&lt;/li&gt;
  &lt;li&gt;Meesho uses Metabase to add security and governance for the data assets.&lt;/li&gt;
  &lt;li&gt;A next planned step is to integrate with &lt;a href=&quot;https://www.metabase.com/docs/latest/data-modeling/models#enable-model-caching-in-metabase&quot;&gt;Metabase Model
Caching&lt;/a&gt;
to improve user experience even more.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;pr-of-the-episode&quot;&gt;PR of the episode&lt;/h2&gt;

&lt;p&gt;Let’s explore the code a bit, instead of focusing on a specific PR. The whole
driver codebase is open source at
&lt;a href=&quot;https://github.com/starburstdata/metabase-driver&quot;&gt;https://github.com/starburstdata/metabase-driver&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As mentioned earlier, the whole driver is written in Clojure, and Andrew tells us
more about his experience writing the driver and working with the two systems.&lt;/p&gt;

&lt;p&gt;We also talk about a recent community &lt;a href=&quot;https://github.com/starburstdata/metabase-driver/pull/59&quot;&gt;PR for datetime
functions&lt;/a&gt; and the
ongoing work to support model caching.&lt;/p&gt;

&lt;h2 id=&quot;datanova-and-other-trino-events&quot;&gt;Datanova and other Trino events&lt;/h2&gt;

&lt;p&gt;We invite you all to join us for the &lt;a href=&quot;http://bit.ly/3j2N9Q9&quot;&gt;free, virtual conference
Datanova&lt;/a&gt; from Starburst. Trino and related tools and
approaches are touched upon in many presentations and discussions.&lt;/p&gt;

&lt;p&gt;If you have an event that is related to Trino, let us know so we can add it to
the &lt;a href=&quot;../community.html#events&quot;&gt;Trino events calendar&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Metabase and Trino are a great combination of tools. Together they unlock use
cases that are difficult or impossible to implement with other tools. Give it a
try!&lt;/p&gt;

&lt;h2 id=&quot;rounding-out&quot;&gt;Rounding out&lt;/h2&gt;

&lt;p&gt;Check out the in-person and virtual
&lt;a href=&quot;https://www.meetup.com/pro/trino-community/&quot;&gt;Trino Meetup groups&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, get the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>43: Trino saves trips with Alluxio</title>
      <link href="https://trino.io/episodes/43.html" rel="alternate" type="text/html" title="43: Trino saves trips with Alluxio" />
      <published>2022-12-15T00:00:00+00:00</published>
      <updated>2022-12-15T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/43</id>
      <content type="html" xml:base="https://trino.io/episodes/43.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Brian Olsen, Developer Advocate at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/bitsondatadev&quot;&gt;@bitsondatadev&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Manfred Moser, Director of Information Engineering at 
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Bin Fan, VP of Open Source at Alluxio and PMC maintainer of Alluxio open 
source and TSC member of Presto (&lt;a href=&quot;https://twitter.com/binfan&quot;&gt;@binfan&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/beinan/&quot;&gt;Beinan Wang&lt;/a&gt;, Software Engineer at 
Alluxio and Presto committer&lt;/li&gt;
&lt;/ul&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/43/alluxio-trino.jpeg&quot; /&gt;
&lt;br /&gt;
The Alluxio crew at Trino Summit 2022. &lt;br /&gt;
From left to right:
&lt;a href=&quot;https://www.linkedin.com/in/beinan/&quot;&gt;Beinan Wang&lt;/a&gt;,
&lt;a href=&quot;https://www.linkedin.com/in/bin-fan/&quot;&gt;Bin Fan&lt;/a&gt;,
&lt;a href=&quot;https://www.linkedin.com/in/bitsondatadev/&quot;&gt;Brian Olsen&lt;/a&gt;,
&lt;a href=&quot;https://www.linkedin.com/in/dennyglee/&quot;&gt;Denny Lee&lt;/a&gt;,
&lt;a href=&quot;https://www.linkedin.com/in/hopechong/&quot;&gt;Hope Wang&lt;/a&gt;,
&lt;a href=&quot;https://www.linkedin.com/in/jasminechenwang/&quot;&gt;Jasmine Wang&lt;/a&gt;.
&lt;br /&gt;
Somehow Denny Lee from &lt;a href=&quot;https://delta.io/&quot;&gt;Delta Lake&lt;/a&gt; snuck in there
😉. Love the data community vibes on this one.

&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-data-caching-and-orchestration&quot;&gt;Concept of the episode: Data caching and orchestration&lt;/h2&gt;

&lt;p&gt;Out of all those petabytes of data you store, only a small fraction of it is
creating business value for you today. When you scan the same data multiple
times and transfer it over the wire, you’re wasting time, compute cycles, and
ultimately money. This gets worse when you’re pulling data across regions or
clouds from disaggregated Trino clusters. In situations like these, caching
solutions can make a tremendous impact on the latency and cost of your queries.&lt;/p&gt;

&lt;h3 id=&quot;trino-without-caching&quot;&gt;Trino without caching&lt;/h3&gt;

&lt;p&gt;There seems to be a sizeable portion of the community who aren’t using a
caching solution. Not all workloads will really benefit from caching. If you
are performing more writes than reads, the cache will need to constantly be
invalidated before performing each read. If you are scanning all your data to
run daily migrations, you would not benefit from caching. However, one of the
most common use cases where Trino shines is interactive ad hoc analytics. This 
type of querying is very fast in Trino, especially when using modern storage 
formats like Iceberg.&lt;/p&gt;

&lt;h3 id=&quot;two-types-of-caching&quot;&gt;Two types of caching&lt;/h3&gt;

&lt;p&gt;There are two types of caching used with Trino. The first type caches the
results of a common query or subquery, so that any query whose predicates
overlap can reuse the cached results.&lt;/p&gt;

&lt;p&gt;The other type is object or file caching. Rather than storing the results of
a query, you cache the files from a file or object store that are
scanned as part of the query.&lt;/p&gt;

&lt;p&gt;In this episode, we will focus on the latter type of caching. This will apply to
connectors like Hive, Iceberg, Delta Lake, and Hudi.&lt;/p&gt;

&lt;h3 id=&quot;hive-connector-caching&quot;&gt;Hive connector caching&lt;/h3&gt;

&lt;p&gt;Trino has an &lt;a href=&quot;https://trino.io/docs/current/connector/hive-caching.html&quot;&gt;embedded caching engine&lt;/a&gt;
in the Hive connector. This is convenient, as it ships with Trino; however, it
does not work outside the Hive connector. The caching engine is
&lt;a href=&quot;https://github.com/qubole/rubix&quot;&gt;Rubix&lt;/a&gt;. While this system works for simple
Hive use cases, it fails to address use cases outside of Hive and hasn’t been
maintained since 2020. Many features are missing, such as security features
and support for more compute engines.&lt;/p&gt;
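
&lt;p&gt;For reference, enabling the embedded cache in a Hive catalog takes roughly the
following catalog properties; the cache location here is an example path:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;hive.cache.enabled=true
hive.cache.location=/opt/hive-cache
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;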

&lt;h3 id=&quot;what-is-alluxio&quot;&gt;What is Alluxio?&lt;/h3&gt;

&lt;p&gt;Alluxio is the world’s first open source data orchestration technology for
analytics and AI in the cloud. It provides a common interface enabling
computation frameworks to connect to numerous storage systems.
Alluxio’s memory-first tiered architecture enables data access at speeds orders
of magnitude faster than existing solutions. Alluxio was originally developed at
the Berkeley AMPLab, &lt;a href=&quot;https://amplab.cs.berkeley.edu/wp-content/uploads/2014/11/2014_socc_tachyon.pdf&quot;&gt;and was originally called Tachyon&lt;/a&gt;.
Tachyon was less focused on caching and data orchestration and more focused on
fault tolerance via lineage and other techniques borrowed from Spark.&lt;/p&gt;

&lt;p&gt;Alluxio lies between data driven applications, such as Trino and Apache Spark,
and various persistent storage systems, such as Amazon S3, Google Cloud Storage,
HDFS, Ceph, and MinIO. Alluxio unifies the data stored in these different
storage systems, presenting unified client APIs and a global namespace to its
upper layer data driven applications.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/43/alluxio-architecture.png&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Alluxio is commonly used as a distributed shared caching service so compute
engines talking to Alluxio can transparently cache frequently accessed data,
especially from remote locations, to provide in-memory I/O throughput. Alluxio
also enables unifying all data storage under a single namespace. This can make
things simpler if your data is stored across different systems, different
regions, or different clouds.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/43/inside-alluxio.png&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Source: &lt;a href=&quot;https://docs.alluxio.io/os/user/stable/en/Overview.html&quot;&gt;https://docs.alluxio.io/os/user/stable/en/Overview.html&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;what-is-data-orchestration&quot;&gt;What is data orchestration?&lt;/h3&gt;

&lt;p&gt;A data orchestration platform abstracts data access across storage systems,
virtualizes all the data, and presents the data via standardized APIs with
global namespace to data-driven applications. In the meantime, it should have
caching functionality to enable fast access to warm data. In summary, a data
orchestration platform provides data-driven applications data accessibility,
data locality, and data elasticity.&lt;/p&gt;

&lt;p&gt;Source: &lt;a href=&quot;https://www.alluxio.io/blog/data-orchestration-the-missing-piece-in-the-data-world/&quot;&gt;https://www.alluxio.io/blog/data-orchestration-the-missing-piece-in-the-data-world/&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;trino-and-alluxio-expedia-use-case&quot;&gt;Trino and Alluxio: Expedia use case&lt;/h3&gt;

&lt;p&gt;Expedia needed the ability to query across clusters in different regions
while simplifying the interface to their local data sources.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;100%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/43/expedia-trino-alluxio.png&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Source: &lt;a href=&quot;https://www.alluxio.io/blog/unifying-cross-region-access-in-the-cloud-at-expedia-group-the-path-toward-data-mesh-in-the-brand-world/&quot;&gt;Unifying cross-region access in the cloud at Expedia Group — The path toward data mesh in the brand world&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-episode-alluxioalluxio-pr-13000-add-a-doc-for-trino&quot;&gt;PR of the episode: Alluxio/alluxio PR 13000 Add a doc for Trino&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/Alluxio/alluxio/pull/13000&quot;&gt;This episode’s PR&lt;/a&gt; is actually
not located in a Trino repository. This PR comes from the Alluxio repository. It
happened in the wake of the rebranding from Presto to Trino. PRs like this
helped the Trino community grow awareness of the new name, and fixed
potential issues that occurred with the hasty renaming we had to do.&lt;/p&gt;

&lt;p&gt;This was submitted by Alluxio engineer &lt;a href=&quot;https://github.com/yuzhu&quot;&gt;David Zhu&lt;/a&gt;.
A huge thanks to David for his contributions to Trino as well!&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-episode-running-trino-on-alluxio&quot;&gt;Demo of the episode: Running Trino on Alluxio&lt;/h2&gt;

&lt;p&gt;This demo of the episode covers how to configure Alluxio to use write-through
caching to MinIO. This is done using the Iceberg connector, with only one change
to the location property on the table from the Trino perspective.&lt;/p&gt;
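
&lt;p&gt;As a sketch, that single change points the storage location at Alluxio when
creating the schema or table. The host, port, and path here are hypothetical
examples:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE SCHEMA iceberg.logging
WITH (location = 'alluxio://alluxio-leader:19998/logging');
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;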

&lt;div class=&quot;video-responsive&quot;&gt;
    &lt;iframe width=&quot;720&quot; height=&quot;405&quot; src=&quot;https://www.youtube.com/embed/yaxPEWRpEzc&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;To follow this demo, copy the code located in the 
&lt;a href=&quot;https://github.com/bitsondatadev/trino-getting-started/tree/main/community_tutorials/alluxio/trino-alluxio-iceberg-minio&quot;&gt;trino-getting-started repo&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Check out the in-person and virtual
&lt;a href=&quot;https://www.meetup.com/pro/trino-community/&quot;&gt;Trino Meetup groups&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>42: Trino Summit 2022 recap</title>
      <link href="https://trino.io/episodes/42.html" rel="alternate" type="text/html" title="42: Trino Summit 2022 recap" />
      <published>2022-11-17T00:00:00+00:00</published>
      <updated>2022-11-17T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/42</id>
      <content type="html" xml:base="https://trino.io/episodes/42.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Brian Olsen, Developer Advocate at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/bitsondatadev&quot;&gt;@bitsondatadev&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate at 
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Manfred Moser, Director of Information Engineering at 
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/simpligility&quot;&gt;@simpligility&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Brian Zhan, Product Manager at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/brianzhan1&quot;&gt;@brianzhan1&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/claudiusli&quot;&gt;Claudius Li&lt;/a&gt;, Product Manager at
&lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dain Sundstrom, Trino creator and CTO at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/daindumb&quot;&gt;@daindumb&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Martin Traverso, Trino creator and CTO at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
(&lt;a href=&quot;https://twitter.com/mtraverso&quot;&gt;@mtraverso&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-402-to-403&quot;&gt;Releases 402 to 403&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-402.html&quot;&gt;Trino 402&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for column comments in Hive and Iceberg views.&lt;/li&gt;
  &lt;li&gt;Support for predicate pushdown on temporal types in the MongoDB connector.&lt;/li&gt;
  &lt;li&gt;Faster &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OR&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nullif&lt;/code&gt;, and arithmetic operations in SQL Server connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-403.html&quot;&gt;Trino 403&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DELETE&lt;/code&gt; in MongoDB.&lt;/li&gt;
  &lt;li&gt;Faster aggregations.&lt;/li&gt;
  &lt;li&gt;Faster data transfers with fault-tolerant execution.&lt;/li&gt;
  &lt;li&gt;Faster &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SHOW SCHEMAS&lt;/code&gt; in BigQuery.&lt;/li&gt;
  &lt;li&gt;Faster &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;expire_snapshots&lt;/code&gt; in Apache Iceberg.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the release notes for
&lt;a href=&quot;https://trino.io/docs/current/release/release-402.html&quot;&gt;Trino 402&lt;/a&gt;,
and
&lt;a href=&quot;https://trino.io/docs/current/release/release-403.html&quot;&gt;Trino 403&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;trino-summit-2022-recap&quot;&gt;Trino Summit 2022 recap&lt;/h2&gt;

&lt;p&gt;This episode we’re doing a recap of both the Trino Summit and the first Trino
Contributor Congregation. We dive into what everyone’s favorite Trino Summit
sessions were. Then we cover key takeaways from the Trino Contributor
Congregation, which took place the day after.&lt;/p&gt;

&lt;p&gt;Check out the in-person and virtual
&lt;a href=&quot;https://www.meetup.com/pro/trino-community/&quot;&gt;Trino Meetup groups&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>41: Trino puts on its Hudi</title>
      <link href="https://trino.io/episodes/41.html" rel="alternate" type="text/html" title="41: Trino puts on its Hudi" />
      <published>2022-10-27T00:00:00+00:00</published>
      <updated>2022-10-27T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/41</id>
      <content type="html" xml:base="https://trino.io/episodes/41.html">&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Brian Olsen, Developer Advocate at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/bitsondatadev&quot;&gt;@bitsondatadev&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate at 
 &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Sagar Sumit, Software Engineer at 
 &lt;a href=&quot;https://www.onehouse.ai&quot;&gt;Onehouse&lt;/a&gt; (&lt;a href=&quot;https://twitter.com/sagarsumit6&quot;&gt;@sagarsumit6&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/yueluhelloworld&quot;&gt;Grace (Yue) Lu&lt;/a&gt;, Software
Engineer at &lt;a href=&quot;https://robinhood.com&quot;&gt;Robinhood&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;register-for-trino-summit-2022&quot;&gt;Register for Trino Summit 2022!&lt;/h2&gt;

&lt;p&gt;Trino Summit 2022 is coming around the corner! This &lt;strong&gt;free&lt;/strong&gt; event on November
10th takes place in person at the Commonwealth Club in San Francisco, CA, and
can also be attended remotely!&lt;/p&gt;

&lt;div class=&quot;video-responsive&quot;&gt;
    &lt;iframe width=&quot;720&quot; height=&quot;405&quot; src=&quot;https://www.youtube.com/embed/o2MJvRKG14M&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;Read about the recently announced speaker sessions and details in these blog posts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2022/09/22/trino-summit-2022-teaser.html&quot;&gt;Trino Summit 2022 first post&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2022/10/19/trino-summit-2022-teaser-2.html&quot;&gt;Trino Summit 2022 second post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://www.starburst.io/info/trinosummit/&quot;&gt;You can register for the conference&lt;/a&gt;
at any time. We must limit in-person registrations to 250 
attendees, so register soon if you plan to attend in person!&lt;/p&gt;

&lt;h2 id=&quot;releases-396-to-401&quot;&gt;Releases 396 to 401&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-396.html&quot;&gt;Trino 396&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improved performance when processing strings.&lt;/li&gt;
  &lt;li&gt;Faster writing of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;array&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;map&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;row&lt;/code&gt; types to Parquet.&lt;/li&gt;
  &lt;li&gt;Support for pushing down complex join criteria to connectors.&lt;/li&gt;
  &lt;li&gt;Support for column and table comments in BigQuery connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-397.html&quot;&gt;Trino 397&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;S3 Select pushdown for JSON data in Hive connector.&lt;/li&gt;
  &lt;li&gt;Faster &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;date_trunc&lt;/code&gt; predicates over partition columns in Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Reduced query latency with Glue catalog in Iceberg connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-398.html&quot;&gt;Trino 398&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New Hudi connector.&lt;/li&gt;
  &lt;li&gt;Improved performance for Parquet data in Delta Lake, Hive and Iceberg connectors.&lt;/li&gt;
  &lt;li&gt;Support for column comments in Accumulo connector.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;timestamp&lt;/code&gt; type in Pinot connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-399.html&quot;&gt;Trino 399&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Faster joins.&lt;/li&gt;
  &lt;li&gt;Faster reads of decimal values in Parquet data.&lt;/li&gt;
  &lt;li&gt;Support for writing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;array&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;row&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;timestamp&lt;/code&gt; columns in BigQuery.&lt;/li&gt;
  &lt;li&gt;Support for predicate pushdown involving datetime types in MongoDB.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-400.html&quot;&gt;Trino 400&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for TRUNCATE in BigQuery connector.&lt;/li&gt;
  &lt;li&gt;Support for the Pinot proxy.&lt;/li&gt;
  &lt;li&gt;Improved latency when querying Iceberg tables with many files.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-401.html&quot;&gt;Trino 401&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improved performance and reliability of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MERGE&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Support for writing to Google Cloud Storage in Delta Lake.&lt;/li&gt;
  &lt;li&gt;Support for IBM Cloud Object Storage in Hive.&lt;/li&gt;
  &lt;li&gt;Support for writes with fault-tolerant execution in MySQL, PostgreSQL, and SQL
Server.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additional highlights worth a mention according to Cole:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The new Hudi connector is worth mentioning twice. It was in the works for a
while, and we’re really excited it has arrived and continues to improve.&lt;/li&gt;
  &lt;li&gt;Trino 396 added support for version three of the Delta Lake writer, then Trino
401 added support for version four, so we’ve jumped from two to four since the
last time you saw us!&lt;/li&gt;
  &lt;li&gt;There have been a ton of fixes to table and column comments across a wide
variety of connectors.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the release notes for
&lt;a href=&quot;https://trino.io/docs/current/release/release-396.html&quot;&gt;Trino 396&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-397.html&quot;&gt;Trino 397&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-398.html&quot;&gt;Trino 398&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-399.html&quot;&gt;Trino 399&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-400.html&quot;&gt;Trino 400&lt;/a&gt;,
and
&lt;a href=&quot;https://trino.io/docs/current/release/release-401.html&quot;&gt;Trino 401&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-week-intro-to-hudi-and-the-hudi-connector&quot;&gt;Concept of the week: Intro to Hudi and the Hudi connector&lt;/h2&gt;

&lt;p&gt;This week we’re talking about the Hudi connector that was added in version 398.&lt;/p&gt;
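
&lt;p&gt;As a minimal sketch, a catalog properties file for the Hudi connector only
needs to point at a Hive metastore. The hostname below is a placeholder, and the
property names follow the current Trino documentation:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;connector.name=hudi
hive.metastore.uri=thrift://example.net:9083
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;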

&lt;h3 id=&quot;what-is-apache-hudi&quot;&gt;What is Apache Hudi?&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://hudi.apache.org/&quot;&gt;Apache Hudi&lt;/a&gt; (pronounced “hoodie”) is a streaming
data lakehouse platform that combines warehouse and database functionality. Hudi
is a table format that enables transactions, efficient upserts/deletes, advanced
indexing, streaming ingestion services, data clustering/compaction
optimizations, and concurrency.&lt;/p&gt;

&lt;p&gt;Hudi is not just a table format, but has many services aimed at creating
efficient incremental batch pipelines. Hudi was born out of Uber and is used at
companies like Amazon, ByteDance, and Robinhood.&lt;/p&gt;

&lt;h3 id=&quot;merge-on-read-mor-and-copy-on-write-cow-tables&quot;&gt;Merge on read (MOR) and copy on write (COW) tables&lt;/h3&gt;

&lt;p&gt;The Hudi table format and services aim to provide a suite of tools that make
Hudi adaptive to realtime and batch use cases on the data lake. Hudi lays
out data following either
&lt;a href=&quot;https://hudi.apache.org/docs/next/table_types#merge-on-read-table&quot;&gt;merge on read&lt;/a&gt;,
which optimizes writes over reads, or
&lt;a href=&quot;https://hudi.apache.org/docs/next/table_types#copy-on-write-table&quot;&gt;copy on write&lt;/a&gt;,
which optimizes reads over writes.&lt;/p&gt;

&lt;h3 id=&quot;hudi-metadata-table&quot;&gt;Hudi metadata table&lt;/h3&gt;

&lt;p&gt;The &lt;a href=&quot;https://hudi.apache.org/docs/next/metadata&quot;&gt;Hudi metadata table&lt;/a&gt; can
improve read/write performance of your queries. The main purpose of this table
is to eliminate the requirement for the “list files” operation, a requirement that
stems from how
&lt;a href=&quot;/blog/2020/10/20/intro-to-hive-connector.html&quot;&gt;Hive-modelled SQL tables&lt;/a&gt;
point to entire directories rather than to specific files with ranges.
Using files with ranges helps prune out files outside the query criteria.&lt;/p&gt;

&lt;h3 id=&quot;hudi-data-layout&quot;&gt;Hudi data layout&lt;/h3&gt;

&lt;p&gt;Hudi uses 
&lt;a href=&quot;https://hudi.apache.org/docs/next/file_layouts&quot;&gt;multiversion concurrency control (MVCC)&lt;/a&gt;,
where a compaction action merges logs and base files to produce new file slices,
while a cleaning action gets rid of unused/older file slices to reclaim space on
the file system.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;100%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/41/hudi-mvcc-files.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h3 id=&quot;robinhood-trino-and-hudi-use-cases&quot;&gt;Robinhood Trino and Hudi use cases&lt;/h3&gt;

&lt;p&gt;One of the well-known users of Trino and Hudi is Robinhood. Grace (Yue) Lu, who
&lt;a href=&quot;https://www.youtube.com/watch?v=gFTDQGRXOus&quot;&gt;joined us at Trino Summit 2021&lt;/a&gt;,
covers Robinhood’s architecture and use cases for Trino and Hudi.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/41/robinhood-hudi-trino-architecture.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;Robinhood ingests data via Debezium and streams it into Hudi. Then Trino is able
to read data as it becomes available in Hudi.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/41/robinhood-use-cases.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;Hudi and Trino support critical use cases like IPO company stock allocation,
liquidity risk monitoring, clearing settlement reports, and generally fresher
metrics reporting and analysis.&lt;/p&gt;

&lt;h3 id=&quot;the-current-state-of-the-trino-hudi-connector&quot;&gt;The current state of the Trino Hudi connector&lt;/h3&gt;

&lt;p&gt;Before we had 
&lt;a href=&quot;https://trino.io/docs/current/connector/hudi.html&quot;&gt;the official Hudi connector&lt;/a&gt;,
many, like Robinhood, had to use the Hive connector. They were therefore not
able to take advantage of the metadata table and many other optimizations Hudi
provides out of the box.&lt;/p&gt;

&lt;p&gt;The connector gets around that and now enables using some Hudi abstractions.
However, the connector is currently limited to read-only mode and doesn’t
support writes. Spark remains the primary system used to write data into Hudi
tables that Trino then reads. Check out the demo to see the connector in action.&lt;/p&gt;

&lt;h3 id=&quot;upcoming-features-in-hudi-connector&quot;&gt;Upcoming features in Hudi connector&lt;/h3&gt;

&lt;p&gt;First, we want to improve read support and cover all query types. As a
next step, we aim to add DDL support.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The connector only supports copy on write tables, and soon we will add merge
on read table support.&lt;/li&gt;
  &lt;li&gt;Hudi has multiple 
&lt;a href=&quot;https://hudi.apache.org/docs/next/table_types#query-types&quot;&gt;query types&lt;/a&gt;.
Support for snapshot queries is coming shortly.&lt;/li&gt;
  &lt;li&gt;Integration with metadata table.&lt;/li&gt;
  &lt;li&gt;Utilize the column statistics index.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;pr-14445-fault-tolerant-execution-for-postgresql-and-mysql-connectors&quot;&gt;PR 14445: Fault-tolerant execution for PostgreSQL and MySQL connectors&lt;/h2&gt;

&lt;p&gt;This &lt;a href=&quot;https://github.com/trinodb/trino/pull/14445&quot;&gt;PR of the episode&lt;/a&gt; was
contributed by Matthew Deady (&lt;a href=&quot;https://github.com/mwd410&quot;&gt;@mwd410&lt;/a&gt;). The
improvements enable writes to PostgreSQL and MySQL when fault-tolerant execution
is enabled (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;retry-policy&lt;/code&gt; is set to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TASK&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;QUERY&lt;/code&gt;). This update included a 
few changes to core classes used by connectors that rely on JDBC clients to
connect to the underlying database. For example, Matthew built on this PR by
adding a few additional changes to get this working in SQL Server in
&lt;a href=&quot;https://github.com/trinodb/trino/pull/14730&quot;&gt;PR 14730&lt;/a&gt;.&lt;/p&gt;
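
&lt;p&gt;As a hedged sketch, fault-tolerant execution is turned on cluster-wide in
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.properties&lt;/code&gt;. Note
that the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TASK&lt;/code&gt; retry
policy also requires an exchange manager for spooling intermediate data; the
bucket name below is a placeholder:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# config.properties
retry-policy=TASK

# exchange-manager.properties
exchange-manager.name=filesystem
exchange.base-directories=s3://example-bucket/trino-spooling
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;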

&lt;p&gt;Thank you so much to Matthew for extending our fault-tolerant execution to
connectors using JDBC clients! As usual, thanks to all the reviewers and
maintainers who got these across the line!&lt;/p&gt;

&lt;h2 id=&quot;demo-using-the-hudi-connector&quot;&gt;Demo: Using the Hudi Connector&lt;/h2&gt;

&lt;p&gt;Let’s start up a local Trino coordinator and Hive metastore. Clone the
repository and navigate to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hudi/trino-hudi-minio&lt;/code&gt; directory. Then
start up the containers using Docker Compose.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git@github.com:bitsondatadev/trino-getting-started.git
cd community_tutorials/hudi/trino-hudi-minio
docker-compose up -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For now, you will need to import data using the Spark and Scala method we detail
in the video. In the near term we will provide a SparkSQL alternative, and we
will update this demo to show the Trino DDL support when it lands.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SHOW CATALOGS;

SHOW SCHEMAS IN hudi;

SHOW TABLES IN hudi.default;

SELECT COUNT(*) FROM hudi.default.hudi_coders_hive;

SELECT * FROM hudi.default.hudi_coders_hive;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://hudi.apache.org/&quot;&gt;Hudi&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://onehouse.io&quot;&gt;Onehouse&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Blog posts&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://robinhood.engineering/author-balaji-varadarajan-e3f496815ebf&quot;&gt;Fresher Data Lake on S3&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check out the in-person and virtual
&lt;a href=&quot;https://www.meetup.com/pro/trino-community/&quot;&gt;Trino Meetup groups&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Hosts</summary>

      
      
    </entry>
  
    <entry>
      <title>40: Trino&apos;s cold as Iceberg!</title>
      <link href="https://trino.io/episodes/40.html" rel="alternate" type="text/html" title="40: Trino&apos;s cold as Iceberg!" />
      <published>2022-09-08T00:00:00+00:00</published>
      <updated>2022-09-08T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/40</id>
      <content type="html" xml:base="https://trino.io/episodes/40.html">&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/15/trino-iceberg.png&quot; /&gt;&lt;br /&gt;
Looks like Commander Bun Bun is safe on this Iceberg&lt;br /&gt;
&lt;a href=&quot;https://joshdata.me/iceberger.html&quot;&gt;https://joshdata.me/iceberger.html&lt;/a&gt;
&lt;/p&gt;

&lt;h2 id=&quot;hosts&quot;&gt;Hosts&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Brian Olsen, Developer Advocate at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/bitsondatadev&quot;&gt;@bitsondatadev&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate at
 &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Ryan Blue, creator of Iceberg and CEO at
 &lt;a href=&quot;https://tabular.io&quot;&gt;Tabular&lt;/a&gt; (&lt;a href=&quot;https://github.com/rdblue&quot;&gt;@rdblue&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Sam Redai, Developer Advocate at &lt;a href=&quot;https://tabular.io&quot;&gt;Tabular&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/samuelredai&quot;&gt;@samuelredai&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/tomnats&quot;&gt;Tom Nats&lt;/a&gt;, Director of Customer Solutions at &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;register-for-trino-summit-2022&quot;&gt;Register for Trino Summit 2022!&lt;/h2&gt;

&lt;p&gt;Trino Summit 2022 is coming around the corner! This &lt;strong&gt;free&lt;/strong&gt; event on November
10th takes place in person at the Commonwealth Club in San Francisco, CA, and
can also be attended remotely! If you want to present, the
&lt;a href=&quot;https://sessionize.com/trino-summit-2022/&quot;&gt;call for speakers&lt;/a&gt; is open until
September 15th.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.starburst.io/info/trinosummit/&quot;&gt;You can register for the conference&lt;/a&gt;
at any time. We must limit in-person registrations to 250
attendees, so register soon if you plan on attending in person!&lt;/p&gt;

&lt;h2 id=&quot;releases-394-to-395&quot;&gt;Releases 394 to 395&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-394.html&quot;&gt;Trino 394&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;JSON output format for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXPLAIN&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Improved performance for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LIKE&lt;/code&gt; expressions.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query&lt;/code&gt; table function in BigQuery connector.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt; support in BigQuery connector.&lt;/li&gt;
  &lt;li&gt;TLS support in Pinot connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-395.html&quot;&gt;Trino 395&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Faster &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt; queries.&lt;/li&gt;
  &lt;li&gt;Better performance for large clusters.&lt;/li&gt;
  &lt;li&gt;Improved memory efficiency for aggregations and fault tolerant execution.&lt;/li&gt;
  &lt;li&gt;Faster aggregations over decimal columns.&lt;/li&gt;
  &lt;li&gt;Support for dynamic function resolution.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additional highlights worth a mention according to Cole:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The improved performance of inserts on Delta Lake, Hive, and Iceberg is a huge
one. We’re not entirely sure how much it’ll matter in production use cases, but
some of the benchmarks suggested it could be massive - one test showed a 75%
reduction in query duration.&lt;/li&gt;
  &lt;li&gt;Dynamic function resolution in the SPI is going to unlock some very neat
possibilities down the line.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the release notes for
&lt;a href=&quot;https://trino.io/docs/current/release/release-394.html&quot;&gt;Trino 394&lt;/a&gt;,
and
&lt;a href=&quot;https://trino.io/docs/current/release/release-395.html&quot;&gt;Trino 395&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-week-latest-features-in-apache-iceberg-and-the-iceberg-connector&quot;&gt;Concept of the week: Latest features in Apache Iceberg and the Iceberg connector&lt;/h2&gt;

&lt;p&gt;It has been over a year since we had Ryan on the Trino Community Broadcast as a
guest to discuss what Apache Iceberg is and how it can be used in Trino. Since
then, the adoption of Iceberg in our community has skyrocketed. Iceberg is
proving to be a much better alternative to the Hive table format.&lt;/p&gt;

&lt;p&gt;The initial phase of the Iceberg connector in Trino aimed to provide fast and
interoperable read support. A typical usage was Trino alongside other query
engines like Apache Spark, which supported many of the data manipulation language
(DML) SQL features on Iceberg. One of the biggest requests we got as adoption
increased was the ability to do everything through Trino. This episode dives
into some of the latest features that were missing from the early iterations of
the Iceberg connector and what has changed in Iceberg as well!&lt;/p&gt;

&lt;h3 id=&quot;what-is-apache-iceberg&quot;&gt;What is Apache Iceberg?&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://iceberg.apache.org/&quot;&gt;Iceberg&lt;/a&gt; is a next-generation table format that
defines a standard around the metadata used to map data to a SQL query engine.
It addresses a lot of the maintainability and reliability issues many engineers
experienced with the way
&lt;a href=&quot;/blog/2020/10/20/intro-to-hive-connector.html&quot;&gt;Hive modeled SQL tables&lt;/a&gt;
over big data files.&lt;/p&gt;

&lt;p&gt;One common point of confusion is that a table format is not equivalent to file
formats like ORC or Parquet. The table format is the layer that maintains
metadata mapping these files to the concept of a table and other common database
abstractions.&lt;/p&gt;

&lt;p&gt;This episode assumes you have some basic knowledge of Trino and Iceberg already. If
you are new to Iceberg or need a refresher, we recommend the two older episodes
about Iceberg and Trino basics:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/episodes/14.html&quot;&gt;14: Iceberg: March of the Trinos&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/episodes/15.html&quot;&gt;15: Iceberg right ahead!&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;why-iceberg-over-other-formats&quot;&gt;Why Iceberg over other formats?&lt;/h3&gt;

&lt;p&gt;There have been some great advancements in big data technologies that brought
back SQL and data warehouse capabilities. However, Hive and Hive-like table
formats are still missing some capabilities due to limitations that Hive tables
have, such as dropping and reintroducing stale data unintentionally. On top of
that, Hive tables require a lot of knowledge of Hive internals. Some recent
formats aim to remain backwards compatible with Hive, but inadvertently
reintroduce these limitations.&lt;/p&gt;

&lt;p&gt;This is not the case with Iceberg. Iceberg has the most support for query
engines and puts a heavy emphasis on being a format that is interoperable. This
improves the level of flexibility users have to address a wider array of use
cases that may involve querying over a system like Snowflake or a data lakehouse
running with Iceberg. All of this is made possible by the
&lt;a href=&quot;https://iceberg.apache.org/spec&quot;&gt;Iceberg specification&lt;/a&gt; that all these query
engines must follow.&lt;/p&gt;

&lt;p&gt;Finally, a great video presented by Ryan Blue that dives into Iceberg is,
“&lt;a href=&quot;https://www.youtube.com/watch?v=_GW3GYZK66U&quot;&gt;Why you shouldn’t care about Iceberg&lt;/a&gt;.”&lt;/p&gt;

&lt;h3 id=&quot;metadata-catalogs&quot;&gt;Metadata catalogs&lt;/h3&gt;

&lt;p&gt;Catalogs, in the context of Iceberg, refer to the central storage of metadata.
Catalogs are also used to provide the atomic compare-and-swap needed to support
&lt;a href=&quot;https://iceberg.apache.org/docs/latest/reliability&quot;&gt;serializable isolation in Iceberg&lt;/a&gt;.
We’ll refer to them as metadata catalogs to avoid confusion with Trino
&lt;a href=&quot;https://trino.io/docs/current/sql/show-catalogs.html&quot;&gt;catalogs&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The two existing catalogs supported in Trino’s Iceberg connector are the
&lt;a href=&quot;https://trino.io/docs/current/connector/iceberg.html#hive-metastore-catalog&quot;&gt;Hive Metastore Service&lt;/a&gt;
and its AWS counterpart, Glue. While this
provides a nice migration path from the Hive model, many are looking to replace
these rather cumbersome catalogs with something more lightweight. It turns out
that the Iceberg connector only uses the Hive Metastore Service to point to the
top-level metadata files in Iceberg, while the majority of the metadata exists in
metadata files in storage. This makes it even more compelling to get rid of the
complex Hive service in favor of simpler services. Two popular catalogs outside of these
are the &lt;a href=&quot;https://iceberg.apache.org/docs/latest/jdbc&quot;&gt;JDBC catalog&lt;/a&gt; and the
&lt;a href=&quot;https://github.com/apache/iceberg/pull/4348&quot;&gt;REST catalog&lt;/a&gt;.&lt;/p&gt;
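
&lt;p&gt;For reference, a minimal sketch of an Iceberg catalog backed by the Hive
Metastore Service looks like the following. The metastore URI is a placeholder,
and property names follow the current Trino documentation:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;connector.name=iceberg
iceberg.catalog.type=hive_metastore
hive.metastore.uri=thrift://example.net:9083
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;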

&lt;p&gt;There are two PRs in progress to support these metadata catalogs in Trino:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/pull/11772&quot;&gt;Trino PR 11772: Support JDBC catalog in Iceberg connector&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/pull/13294&quot;&gt;Trino PR 13294: Add Iceberg RESTSessionCatalog Implementation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;branching-tagging-and-auditing-oh-my&quot;&gt;Branching, tagging, and auditing, oh my!&lt;/h3&gt;

&lt;p&gt;Another feature set that is coming to Iceberg is the ability to use
&lt;a href=&quot;https://github.com/apache/iceberg/pull/5364&quot;&gt;refs to alias your snapshots&lt;/a&gt;.
This enables branching and tagging behavior similar to git, treating each
snapshot as a commit. It is yet another way to simplify moving between
known states of the data in Iceberg.&lt;/p&gt;

&lt;p&gt;On a related note, branching and tagging will eventually be used in the
&lt;a href=&quot;https://tabular.io/blog/integrated-audits&quot;&gt;audit integration in Iceberg&lt;/a&gt;.
Auditing allows you to push a soft commit by making a snapshot available, but
it is not initially published to the primary table. This is achieved using Spark
and setting the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spark.wap.id&lt;/code&gt; configuration property. This enables interesting
patterns like
&lt;a href=&quot;https://www.dremio.com/subsurface/write-audit-publish-pattern-via-apache-iceberg/&quot;&gt;Write-Audit-Publish (WAP) pattern&lt;/a&gt;,
where you first write the data, audit it using a data quality tool like
&lt;a href=&quot;https://greatexpectations.io&quot;&gt;Great Expectations&lt;/a&gt;, and lastly publish the data
to be visible from the main table. Currently, auditing has to use the
cherry-pick operation to publish. This becomes more streamlined with branching
and tagging.&lt;/p&gt;
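
&lt;p&gt;As a rough sketch of the pattern in Spark SQL, using a hypothetical table and
audit ID, the flow looks roughly like the following. The snapshot ID passed to
the cherry-pick procedure is a placeholder:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-- enable write-audit-publish on the (hypothetical) table
ALTER TABLE db.events SET TBLPROPERTIES ('write.wap.enabled'='true');

-- stage subsequent writes under an audit ID instead of publishing them
SET spark.wap.id = 'audit-2022-10-27';
INSERT INTO db.events VALUES (1, 'signup');

-- after the audit passes, publish the staged snapshot (placeholder ID)
CALL system.cherrypick_snapshot('db.events', 123456789);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;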

&lt;h3 id=&quot;the-puffin-file-format&quot;&gt;The Puffin file format&lt;/h3&gt;

&lt;p&gt;The &lt;a href=&quot;https://iceberg.apache.org/puffin-spec&quot;&gt;Puffin file format&lt;/a&gt; is a
companion to data file formats like &lt;a href=&quot;https://parquet.apache.org/&quot;&gt;Parquet&lt;/a&gt; and
&lt;a href=&quot;https://orc.apache.org/&quot;&gt;ORC&lt;/a&gt;. This format stores information such as indexes
and statistics about data managed in an Iceberg table that cannot be stored
directly within the Iceberg manifest. A Puffin file contains arbitrary pieces of
information called “blobs”, along with metadata necessary to interpret them.&lt;/p&gt;

&lt;p&gt;This format &lt;a href=&quot;https://www.mail-archive.com/dev@iceberg.apache.org/msg03593.html&quot;&gt;was proposed&lt;/a&gt;
by long-time Trino maintainer, &lt;a href=&quot;https://github.com/findepi&quot;&gt;Piotr Findeisen @findepi&lt;/a&gt;,
to address a performance issue noted when using Trino on Iceberg. The Puffin
format is a great extension for those using Iceberg tables, as it enables better
query plans in Trino at the file level.&lt;/p&gt;

&lt;h3 id=&quot;pyiceberg&quot;&gt;pyIceberg&lt;/h3&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/apache/iceberg/tree/master/python&quot;&gt;pyIceberg library&lt;/a&gt;
is an exciting development that enables users to read their data directly from
Iceberg into their own Python code easily.&lt;/p&gt;

&lt;h3 id=&quot;trino-iceberg-connector-updates&quot;&gt;Trino Iceberg connector updates&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/sql/merge&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MERGE&lt;/code&gt;&lt;/a&gt; (&lt;a href=&quot;https://github.com/trinodb/trino/pull/7933&quot;&gt;PR&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/sql/update&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UPDATE&lt;/code&gt;&lt;/a&gt; (&lt;a href=&quot;https://github.com/trinodb/trino/pull/12026&quot;&gt;PR&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/sql/delete&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DELETE&lt;/code&gt;&lt;/a&gt; (&lt;a href=&quot;https://github.com/trinodb/trino/pull/11886&quot;&gt;PR&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Time travel (&lt;a href=&quot;https://github.com/trinodb/trino/pull/10258&quot;&gt;PR&lt;/a&gt;) was initially
released in
&lt;a href=&quot;https://trino.io/docs/current/release/release-385.html#iceberg-connector&quot;&gt;version 385&lt;/a&gt;,
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@&lt;/code&gt; syntax for snapshots/time travel
&lt;a href=&quot;https://github.com/trinodb/trino/pull/10768&quot;&gt;was deprecated&lt;/a&gt; in
&lt;a href=&quot;https://trino.io/docs/current/release/release-387.html#iceberg-connector&quot;&gt;version 387&lt;/a&gt;,
and there were two bug fixes for this feature in versions
&lt;a href=&quot;https://trino.io/docs/current/release/release-386.html#iceberg-connector&quot;&gt;386&lt;/a&gt; and
&lt;a href=&quot;https://trino.io/docs/current/release/release-388.html#iceberg-connector&quot;&gt;388&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/connector/iceberg.html#alter-table-set-properties&quot;&gt;Partition migration&lt;/a&gt;
(&lt;a href=&quot;https://github.com/trinodb/trino/pull/12259&quot;&gt;PR&lt;/a&gt;).
While Trino was able to read tables with these migrations applied by other query
engines, this feature allows Trino to write these changes.&lt;/li&gt;
  &lt;li&gt;The following three features are table maintenance commands.
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/connector/iceberg.html#optimize&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;optimize&lt;/code&gt;&lt;/a&gt;
(&lt;a href=&quot;https://github.com/trinodb/trino/pull/10497&quot;&gt;PR&lt;/a&gt;), which is equivalent to
the Spark SQL
&lt;a href=&quot;https://iceberg.apache.org/docs/latest/spark-procedures/#rewrite_data_files&quot;&gt;rewrite_data_files&lt;/a&gt; procedure.&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/connector/iceberg.html#expire-snapshots&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;expire_snapshots&lt;/code&gt;&lt;/a&gt;
(&lt;a href=&quot;https://github.com/trinodb/trino/pull/10810&quot;&gt;PR&lt;/a&gt;), which uses the same name
as the equivalent Spark procedure.&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/connector/iceberg.html#remove-orphan-files&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;remove_orphan_files&lt;/code&gt;&lt;/a&gt;
(&lt;a href=&quot;https://github.com/trinodb/trino/pull/10810&quot;&gt;PR&lt;/a&gt;), which also uses the same
name as the equivalent Spark procedure.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Iceberg v2 support (&lt;a href=&quot;https://github.com/trinodb/trino/pull/11880&quot;&gt;PR1&lt;/a&gt;, &lt;a href=&quot;https://github.com/trinodb/trino/pull/12351&quot;&gt;PR2&lt;/a&gt;, &lt;a href=&quot;https://github.com/trinodb/trino/pull/12749&quot;&gt;PR3&lt;/a&gt;, &lt;a href=&quot;https://github.com/trinodb/trino/pull/11642&quot;&gt;PR4&lt;/a&gt;, &lt;a href=&quot;https://github.com/trinodb/trino/pull/9881&quot;&gt;PR5&lt;/a&gt;, and many more…)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Almost every release has some sort of Iceberg improvement around
&lt;a href=&quot;https://github.com/trinodb/trino/pull/13636&quot;&gt;planning&lt;/a&gt; or
&lt;a href=&quot;https://github.com/trinodb/trino/pull/13395&quot;&gt;pushdown&lt;/a&gt;. If you want all the
latest features and performance improvements described here, it’s important to
keep up with the latest Trino version.&lt;/p&gt;

&lt;h2 id=&quot;pr-13111-scale-table-writers-per-task-based-on-throughput&quot;&gt;PR 13111: Scale table writers per task based on throughput&lt;/h2&gt;

&lt;p&gt;This &lt;a href=&quot;https://github.com/trinodb/trino/pull/13111&quot;&gt;PR of the episode&lt;/a&gt; was
contributed by Gaurav Sehgal (&lt;a href=&quot;https://github.com/gaurav8297&quot;&gt;@gaurav8297&lt;/a&gt;) to
enable Trino to automatically scale writers. The PR adjusts the number of task
writers per worker based on throughput.&lt;/p&gt;

&lt;p&gt;You can enable this feature by setting &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scale_task_writers&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;true&lt;/code&gt; in your
configuration. Initial test results show up to a sixfold speed increase.&lt;/p&gt;
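
&lt;p&gt;As a sketch, assuming the session property name mentioned above (property
names can change between releases, so check the release notes for your
version), enabling the feature for a single session looks like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SET SESSION scale_task_writers = true;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;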

&lt;p&gt;Thank you so much to Gaurav and all the reviewers who got this PR through!&lt;/p&gt;

&lt;h2 id=&quot;demo-dml-operations-on-iceberg-using-trino&quot;&gt;Demo: DML operations on Iceberg using Trino&lt;/h2&gt;

&lt;p&gt;For this episode’s demo, we use the same schema as the demo we ran in
&lt;a href=&quot;https://trino.io/episodes/15.html&quot;&gt;episode 15&lt;/a&gt;, and revise the syntax to
include new features.&lt;/p&gt;

&lt;p&gt;Let’s start up a local Trino coordinator, Hive metastore, and MinIO instance.
Clone the repository and navigate to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;iceberg/trino-iceberg-minio&lt;/code&gt; directory. Then
start up the containers using Docker Compose.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git@github.com:bitsondatadev/trino-getting-started.git
cd trino-getting-started/iceberg/trino-iceberg-minio
docker-compose up -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now open up your favorite Trino client and connect it to
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;localhost:8080&lt;/code&gt; to run the following commands:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;/**
 * Make sure to first create a bucket named &quot;logging&quot; in MinIO before running
 */
CREATE SCHEMA iceberg.logging
WITH (location = &apos;s3a://logging/&apos;);

/**
 * Create table
 */
CREATE TABLE iceberg.logging.logs (
   level varchar NOT NULL,
   event_time timestamp(6) with time zone NOT NULL,
   message varchar NOT NULL,
   call_stack array(varchar)
)
WITH (
   format_version = 2, -- New property to specify Iceberg spec format. Default 2
   format = &apos;ORC&apos;,
   partitioning = ARRAY[&apos;day(event_time)&apos;,&apos;level&apos;]
);

/**
 * Inserting two records. Notice event_time is on the same day but different hours.
 */

INSERT INTO iceberg.logging.logs VALUES
(
  &apos;ERROR&apos;,
  timestamp &apos;2021-04-01 12:23:53.383345&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;,
  &apos;1 message&apos;,
  ARRAY [&apos;Exception in thread &quot;main&quot; java.lang.NullPointerException&apos;]
),
(
  &apos;ERROR&apos;,
  timestamp &apos;2021-04-01 13:36:23&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;,
  &apos;2 message&apos;,
  ARRAY [&apos;Exception in thread &quot;main&quot; java.lang.NullPointerException&apos;]
);

SELECT * FROM iceberg.logging.logs;
SELECT * FROM iceberg.logging.&quot;logs$partitions&quot;;

/**
 * Notice one partition was created for both records at the day granularity.
 */

/**
 * Update the partitioning from daily to hourly 🎉
 */
ALTER TABLE iceberg.logging.logs
SET PROPERTIES partitioning = ARRAY[&apos;hour(event_time)&apos;];

/**
 * Inserting three records. Notice event_time is on the same day but different hours.
 */
INSERT INTO iceberg.logging.logs VALUES
(
  &apos;ERROR&apos;,
  timestamp &apos;2021-04-01 15:55:23&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;,
  &apos;3 message&apos;,
  ARRAY [&apos;Exception in thread &quot;main&quot; java.lang.NullPointerException&apos;]
),
(
  &apos;WARN&apos;,
  timestamp &apos;2021-04-01 15:55:23&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;,
  &apos;4 message&apos;,
  ARRAY [&apos;bad things could be happening&apos;]
),
(
  &apos;WARN&apos;,
  timestamp &apos;2021-04-01 16:55:23&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;,
  &apos;5 message&apos;,
  ARRAY [&apos;bad things could be happening&apos;]
);

SELECT * FROM iceberg.logging.logs;
SELECT * FROM iceberg.logging.&quot;logs$partitions&quot;;

/**
 * Now there are three partitions:
 * 1) One partition at the day granularity containing our original records.
 * 2) One at the hour granularity for hour 15 containing two new records.
 * 3) One at the hour granularity for hour 16 containing the last new record.
 */

SELECT * FROM iceberg.logging.logs
WHERE event_time &amp;lt; timestamp &apos;2021-04-01 16:55:23&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;;

/**
 * This query correctly returns 4 records with only the first two partitions
 * being touched. Now let&apos;s check the snapshots.
 */


SELECT * FROM iceberg.logging.logs;

SELECT * FROM iceberg.logging.&quot;logs$snapshots&quot;;

/**
 * Update
 */
UPDATE
  iceberg.logging.logs
SET
  call_stack = call_stack || &apos;WHALE HELLO THERE!&apos;
WHERE
  lower(level) = &apos;warn&apos;;

SELECT * FROM iceberg.logging.logs;

SELECT * FROM iceberg.logging.&quot;logs$snapshots&quot;;

/**
 * Read data from an old snapshot (Time travel)
 *
 * Old way: SELECT * FROM iceberg.logging.&quot;logs@2806470637437034115&quot;;
 */

SELECT * FROM iceberg.logging.logs FOR VERSION AS OF 2806470637437034115;

/**
 * Merge
 */
CREATE TABLE iceberg.logging.src (
   level varchar NOT NULL,
   message varchar NOT NULL,
   call_stack array(varchar)
)
WITH (
   format = &apos;ORC&apos;
);

INSERT INTO iceberg.logging.src VALUES
 (
   &apos;ERROR&apos;,
   &apos;3 message&apos;,
   ARRAY [&apos;This one will not show up because it is an ERROR&apos;]
 ),
 (
   &apos;WARN&apos;,
   &apos;4 message&apos;,
   ARRAY [&apos;This should show up&apos;]
 ),
 (
   &apos;WARN&apos;,
   &apos;5 message&apos;,
   ARRAY [&apos;This should show up as well&apos;]
 );

MERGE INTO iceberg.logging.logs AS t
USING iceberg.logging.src AS s
ON s.message = t.message
WHEN MATCHED AND s.level = &apos;ERROR&apos;
        THEN DELETE
WHEN MATCHED
    THEN UPDATE
        SET message = s.message || &apos;-updated&apos;,
            call_stack = s.call_stack || t.call_stack;

DROP TABLE iceberg.logging.logs;

DROP SCHEMA iceberg.logging;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is just the tip of the iceberg, showing the powerful &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MERGE&lt;/code&gt; statement
and the other features that have been added to the Iceberg connector!&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://iceberg.apache.org/&quot;&gt;Iceberg&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://tabular.io&quot;&gt;Tabular&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://iceberg.apache.org/community&quot;&gt;Iceberg Community&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://iceberg.apache.org/talks&quot;&gt;Iceberg Talks&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://iceberg.apache.org/blogs&quot;&gt;Iceberg Blogs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Blog posts&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/05/03/a-gentle-introduction-to-iceberg.html&quot;&gt;Trino on ice I: A gentle introduction to Iceberg&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/07/12/in-place-table-evolution-and-cloud-compatibility-with-iceberg.html&quot;&gt;Trino on ice II: In-place table evolution and cloud compatibility with Iceberg&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/07/30/iceberg-concurrency-snapshots-spec.html&quot;&gt;Trino on ice III: Iceberg concurrency model, snapshots, and the Iceberg spec&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/08/12/deep-dive-into-iceberg-internals.html&quot;&gt;Trino on ice IV: Deep dive into Iceberg internals&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check out the in-person and virtual
&lt;a href=&quot;https://www.meetup.com/pro/trino-community/&quot;&gt;Trino Meetup groups&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Looks like Commander Bun Bun is safe on this Iceberg https://joshdata.me/iceberger.html</summary>

      
      
    </entry>
  
    <entry>
      <title>39: Raft floats on Trino to federate silos</title>
      <link href="https://trino.io/episodes/39.html" rel="alternate" type="text/html" title="39: Raft floats on Trino to federate silos" />
      <published>2022-08-18T00:00:00+00:00</published>
      <updated>2022-08-18T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/39</id>
      <content type="html" xml:base="https://trino.io/episodes/39.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;p&gt;In this episode, we talk to two engineers from
&lt;a href=&quot;https://goraft.tech/&quot;&gt;Raft&lt;/a&gt; and discuss how they use Trino to connect data
silos that exist across different departments in various government sectors:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/edwardwmorgan/&quot;&gt;Edward Morgan&lt;/a&gt;, 
Senior Platform Engineer/DevSecOps Manager at Raft&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/steve-morgan-b9bb6642/&quot;&gt;Steve Morgan&lt;/a&gt;, Chief
Data Engineer at Raft&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;register-for-trino-summit-2022&quot;&gt;Register for Trino Summit 2022!&lt;/h2&gt;

&lt;p&gt;Trino Summit 2022 is right around the corner! This hybrid event takes place on
November 10th, in person at the Commonwealth Club in San Francisco, CA, and can
also be attended remotely. If you want to present, the
&lt;a href=&quot;https://sessionize.com/trino-summit-2022/&quot;&gt;call for speakers&lt;/a&gt; is open until
September 15th.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://www.starburst.io/info/trinosummit/&quot;&gt;You can register for the conference&lt;/a&gt;
at any time. We must limit in-person registrations to 250 
attendees, so register soon if you plan on attending in person!&lt;/p&gt;

&lt;h2 id=&quot;releases-392-to-393&quot;&gt;Releases 392 to 393&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-392.html&quot;&gt;Trino 392&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for dynamic filtering with fault-tolerant query execution.&lt;/li&gt;
  &lt;li&gt;Support for correlated subqueries in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DELETE&lt;/code&gt; queries.&lt;/li&gt;
  &lt;li&gt;Support for Amazon S3 Select pushdown for JSON files.&lt;/li&gt;
  &lt;li&gt;Support for Avro format in Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Faster queries when filtering by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__time&lt;/code&gt; column in Druid.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-393.html&quot;&gt;Trino 393&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MERGE&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Improved performance of highly selective &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LIMIT&lt;/code&gt; queries.&lt;/li&gt;
  &lt;li&gt;Experimental docker image for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ppc64le&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Dynamic filtering support for various connectors.&lt;/li&gt;
  &lt;li&gt;Support for JSON and bytes type in Pinot.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additional highlights worth a mention according to Manfred:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Lots of other improvements on Delta Lake, Hive, and Iceberg connectors.&lt;/li&gt;
  &lt;li&gt;Merge support in a bunch of connectors.&lt;/li&gt;
  &lt;li&gt;OAuth 2.0 refresh token fixes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the release notes for
&lt;a href=&quot;https://trino.io/docs/current/release/release-392.html&quot;&gt;Trino 392&lt;/a&gt;,
and
&lt;a href=&quot;https://trino.io/docs/current/release/release-393.html&quot;&gt;Trino 393&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-trino-at-raft&quot;&gt;Concept of the episode: Trino at Raft&lt;/h2&gt;

&lt;p&gt;Raft provides consulting services and is particularly skilled at DevSecOps. One
challenge they face is dealing with fragmented government infrastructure. In
this episode, we dive in to learn how Trino enables Raft to supply government
sector clients with a data fabric solution. Raft takes a special stance on using
and contributing to open source solutions that run well on the cloud.&lt;/p&gt;

&lt;h3 id=&quot;intro-to-software-factories&quot;&gt;Intro to software factories&lt;/h3&gt;

&lt;blockquote&gt;
  &lt;p&gt;A “software factory” is an organized approach to software development that
provides software design and development teams a repeatable, well-defined path
to create and update software. It results in a robust, compliant, and more
resilient process for delivering applications to production.
– &lt;a href=&quot;https://tanzu.vmware.com/software-factory&quot;&gt;VMWare&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is a push back against previous attempts by larger government contractors
to build one-size-fits-all solutions, which ultimately failed. The new wave of
government solutions relies on methodologies similar to those of the software
industry, with additional rules and standards around the technologies that can
be adopted in the stack.&lt;/p&gt;

&lt;p&gt;Software factories are now a common practice for government agencies, as they
can take standardized software stacks through rigorous validation to make sure
they meet the standards of the government. One important element of these stacks
is that they can be deployed in virtually any environment. A common way to
achieve this is using Kubernetes and containers.&lt;/p&gt;

&lt;h3 id=&quot;standards-and-anatomy-of-a-stack&quot;&gt;Standards and anatomy of a stack&lt;/h3&gt;

&lt;p&gt;With the movement towards standardization, government contractors will generally
build their stack using Kubernetes templates. Kubernetes underpins each of these
stacks while telemetry, monitoring, and policy agents are layered on after that.
For Raft, they wanted to provide a “single pane of glass” over the existing
fragmented systems that the Department of Defense (DoD) operates on. They began
to develop a stack that included Trino as their method to connect data over
various silos.&lt;/p&gt;

&lt;h3 id=&quot;data-fabric-at-raft&quot;&gt;Data Fabric at Raft&lt;/h3&gt;

&lt;p&gt;Data Fabric is an attempt to provide government agencies the ability to set up
a data mesh that is backed by Trino. Trino fits well in this narrative as it
provides SQL-over-everything. Data analysts and data scientists only need to
know SQL.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Data Fabric MVP is an end-to-end DataOps capability that can be deployed at the
edge, in the cloud, and in disconnected environments within minutes. It provides
a single control plane for normalizing and combining disparate data lakes, 
platforms, silos, and formats into SQL using Trino for batch data and Apache 
Pinot for user facing streaming analytics.&lt;/p&gt;

  &lt;p&gt;Data Fabric is driven by cloud native policy using Open Policy Agent (OPA) 
integrated with Trino and Kafka to provide row and column level obfuscation. It
provides enterprise data catalog to view data lineage, properties, and data
owners from multiple data platforms. – &lt;a href=&quot;https://datafabric.goraft.tech/&quot;&gt;Raft&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3 id=&quot;security-concerns-around-trino&quot;&gt;Security concerns around Trino&lt;/h3&gt;

&lt;p&gt;A common first question the Raft team gets asked is whether Trino is a high
security concern. The idea that Trino can connect to multiple data sources from
one location raises the fear that individuals may gain access to information at
a higher classification level than they have. The team has to educate the
different users on best practices and how to ensure this problem doesn’t
occur. You need a separate deployment of Data Fabric for each
classification level, and policies in OPA must be correctly defined to restrict
visibility to information above a user’s clearance.&lt;/p&gt;

&lt;h3 id=&quot;iron-bank-container-repository&quot;&gt;Iron Bank container repository&lt;/h3&gt;

&lt;p&gt;Iron Bank is a central repository of digitally-signed container images, 
including open-source and commercial off-the-shelf software, hardened to the 
DoD’s exacting specifications. Approved containers in Iron Bank have DoD-wide 
reciprocity across all classifications, accelerating the security approval 
process from months or even years down to weeks.&lt;/p&gt;

&lt;p&gt;To be considered for inclusion into Iron Bank, container images must meet
rigorous DoD software security standards. It is an extensive, continuous,
complicated effort for even the most sophisticated IT teams. Continuously
maintaining and managing hardening pipelines while incorporating evolving DoD
specifications and addressing new vulnerabilities (CVEs) can severely stretch
your resources, even if you have advanced tooling and experience in-house. 
(&lt;a href=&quot;https://oteemo.com/accelerate-to-iron-bank/&quot;&gt;Source&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;The Trino Docker image 
&lt;a href=&quot;https://repo1.dso.mil/dsop?filter=trino&quot;&gt;is available in Iron Bank&lt;/a&gt; and is
maintained by folks at &lt;a href=&quot;https://www.boozallen.com/&quot;&gt;Booz Allen Hamilton&lt;/a&gt;. Their
hard work makes it possible for Trino to be deployed in DoD environments.&lt;/p&gt;

&lt;h2 id=&quot;pull-requests-of-the-episode-pr-13354-add-s3-select-pushdown-for-json-files&quot;&gt;Pull requests of the episode: PR 13354: Add S3 Select pushdown for JSON files&lt;/h2&gt;

&lt;p&gt;This &lt;a href=&quot;https://github.com/trinodb/trino/pull/13354&quot;&gt;PR of the episode&lt;/a&gt; was
contributed by &lt;a href=&quot;https://github.com/preethiratnam&quot;&gt;preethiratnam&lt;/a&gt;. This pull
request enables S3 Select pushdown during a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SELECT&lt;/code&gt; operation for JSON files. The
pushdown logic is restricted to root JSON fields, similar to CSV. S3 Select
does support nested column filtering on JSON files, but to limit the scope of
this change, that support is planned for a later PR.&lt;/p&gt;
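
&lt;p&gt;As a sketch, assuming the catalog property name from the Hive connector
documentation at the time, the pushdown is switched on per catalog in its
properties file:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;hive.s3select-pushdown.enabled=true
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;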

&lt;p&gt;It’s already expensive enough to query JSON files, as you pay a hefty penalty
for deserialization. This at least filters out a lot of rows. Thanks to 
&lt;a href=&quot;https://github.com/arhimondr&quot;&gt;Andrii Rosa &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;arhimondr&lt;/code&gt;&lt;/a&gt; for the review.&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-episode-running-great-expectations-on-a-trino-data-lakehouse-tutorial&quot;&gt;Demo of the episode: Running Great Expectations on a Trino Data Lakehouse Tutorial&lt;/h2&gt;

&lt;p&gt;For this episode’s demo, you’ll need a local Trino coordinator, MinIO instance,
Hive metastore, and an edge node where various data libraries like Great
Expectations can run. Clone the
&lt;a href=&quot;https://github.com/bitsondatadev/trino-datalake&quot;&gt;trino-datalake&lt;/a&gt;
repository and navigate to the root directory in your CLI. Then
start up the containers using Docker Compose.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git@github.com:bitsondatadev/trino-datalake.git

cd trino-datalake

docker-compose up -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The rest of the demo is available in 
&lt;a href=&quot;https://github.com/bitsondatadev/trino-datalake/blob/main/tutorials/expecting-greatness-from-trino.md&quot;&gt;this markdown tutorial&lt;/a&gt;
and is covered in the video demo below.&lt;/p&gt;

&lt;div class=&quot;youtube-video-container&quot;&gt;
  &lt;iframe width=&quot;702&quot; height=&quot;405&quot; src=&quot;https://www.youtube.com/embed/h6UYOilESfQ&quot; title=&quot;YouTube video player&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;h2 id=&quot;question-of-the-episode-how-can-i-deploy-trino-on-kubernetes-without-using-helm-charts&quot;&gt;Question of the episode: How can I deploy Trino on Kubernetes without using Helm charts?&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trinodb.slack.com/archives/C0305TQ05KL/p1660685654979289&quot;&gt;Full question from Trino Slack&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This user was not able to use Helm due to a restriction at their company. They
needed the raw Kubernetes YAML files to deploy Trino.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Answer:&lt;/em&gt; While Helm offers convenient ways to deploy directly to a
service that understands Helm charts, you can also use Helm on your machine to
generate all of the Kubernetes YAML configuration files. This can be done using
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;helm template&lt;/code&gt; command. See more on this in the
&lt;a href=&quot;https://trino.io/episodes/31.html&quot;&gt;Trinetes episode&lt;/a&gt; that details this command.&lt;/p&gt;
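
&lt;p&gt;As a sketch, assuming the community Helm charts published at
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trinodb.github.io/charts&lt;/code&gt;, rendering the raw YAML locally and applying it
could look like this (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my-trino&lt;/code&gt; is an example release name):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;helm repo add trino https://trinodb.github.io/charts
helm template my-trino trino/trino &amp;gt; trino.yaml
kubectl apply -f trino.yaml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;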

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://datafabric.goraft.tech/&quot;&gt;Raft Data Fabric&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://twitter.com/raft_tech&quot;&gt;Raft Twitter&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/company/raft-tech/&quot;&gt;Raft LinkedIn&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://boards.greenhouse.io/raft&quot;&gt;Raft Jobs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Blogs&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://goraft.tech/2022/08/15/trino-sql-everything.html&quot;&gt;Trino - SQL to rule them all&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.airforce-technology.com/news/raft-wins-usaf-sbir-phase-iii-contract/&quot;&gt;Raft wins USAF SBIR Phase III contract for data centralisation services&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://goraft.tech/blog/&quot;&gt;Raft Blog&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check out the in-person and virtual
&lt;a href=&quot;https://www.meetup.com/pro/trino-community/&quot;&gt;Trino Meetup groups&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Slowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests</summary>

      
      
    </entry>
  
    <entry>
      <title>38: Trino tacks on polymorphic table functions</title>
      <link href="https://trino.io/episodes/38.html" rel="alternate" type="text/html" title="38: Trino tacks on polymorphic table functions" />
      <published>2022-07-21T00:00:00+00:00</published>
      <updated>2022-07-21T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/38</id>
      <content type="html" xml:base="https://trino.io/episodes/38.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;p&gt;In this episode we have the pleasure of chatting with a couple of familiar faces
who have been hard at work building and understanding the features we’re talking
about today:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/kasiafi&quot;&gt;Kasia Findeisen&lt;/a&gt;, Trino Maintainer&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://twitter.com/mtraverso&quot;&gt;Martin Traverso&lt;/a&gt;, Trino Co-creator and Maintainer&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-387-to-391&quot;&gt;Releases 387 to 391&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-387.html&quot;&gt;Trino 387&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for writing ORC Bloom filters for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;varchar&lt;/code&gt; columns.&lt;/li&gt;
  &lt;li&gt;Support for querying Pinot via the gRPC endpoint.&lt;/li&gt;
  &lt;li&gt;Support for predicate pushdown on string columns in Redis.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OPTIMIZE&lt;/code&gt; on Iceberg tables with non-identity partitioning.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-388.html&quot;&gt;Trino 388&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for JSON output in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXPLAIN&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Improved performance for row data types.&lt;/li&gt;
  &lt;li&gt;Support for OAuth 2.0 refresh tokens.&lt;/li&gt;
  &lt;li&gt;Support for table and column comments in Delta Lake.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-389.html&quot;&gt;Trino 389&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improved performance for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;row&lt;/code&gt; type and aggregation.&lt;/li&gt;
  &lt;li&gt;Faster joins when spilling to disk is disabled.&lt;/li&gt;
  &lt;li&gt;Improved performance when writing non-structural types to Parquet.&lt;/li&gt;
  &lt;li&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;raw_query&lt;/code&gt; table function for full query pass-through in Elasticsearch.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-390.html&quot;&gt;Trino 390&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for setting comments on views.&lt;/li&gt;
  &lt;li&gt;Improved &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UNNEST&lt;/code&gt; performance.&lt;/li&gt;
  &lt;li&gt;Support for Databricks runtime 10.4 LTS in Delta Lake connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-391.html&quot;&gt;Trino 391&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for AWS Athena partition projection.&lt;/li&gt;
  &lt;li&gt;Faster writing of Parquet data in Iceberg and Delta Lake.&lt;/li&gt;
  &lt;li&gt;Support for reading BigQuery external tables.&lt;/li&gt;
  &lt;li&gt;Support for table and column comments in BigQuery.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additional highlights and notes according to Manfred:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2022/07/14/trino-updates-to-java-17.html&quot;&gt;Java 17 arrived as required runtime in 390&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Remove support for Elasticsearch versions below 6.6.0, add testing for OpenSearch 1.1.0.&lt;/li&gt;
  &lt;li&gt;The new raw query table function in Elasticsearch can replace the old full text search and query pass-through support.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the release notes for
&lt;a href=&quot;https://trino.io/docs/current/release/release-387.html&quot;&gt;Trino 387&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-388.html&quot;&gt;Trino 388&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-389.html&quot;&gt;Trino 389&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-390.html&quot;&gt;Trino 390&lt;/a&gt;,
and &lt;a href=&quot;https://trino.io/docs/current/release/release-391.html&quot;&gt;Trino 391&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-polymorphic-table-functions&quot;&gt;Concept of the episode: Polymorphic table functions&lt;/h2&gt;

&lt;p&gt;We normally cover a broad variety of topics in the Trino Community Broadcast,
exploring different technical details, pull requests, and neat things that are
going on in Trino at large. In this episode, however, we’re going to be more
focused, only taking a look at a particular piece of functionality that we’re
all very excited about: polymorphic table functions, or PTFs for short. If
you’re unfamiliar with what this means, that can sound like technobabble word
soup, so we can start exploring this with a simple question…&lt;/p&gt;

&lt;h3 id=&quot;what-is-a-table-function&quot;&gt;What is a table function?&lt;/h3&gt;

&lt;p&gt;The easiest answer to this question is that it’s a function which returns a
table. Scalar, aggregate, and window functions all work a little differently,
but ultimately, they all return a single value each time they are invoked. Table
functions are unique in that they return an entire table. This gives them some
interesting properties that we’ll dive into, but it also means that you can only
invoke them in situations where you’d use a full table, such as a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FROM&lt;/code&gt; clause:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my_table_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;foo&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can also use table functions in joins:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my_table_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;bar&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;another_table_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And while that’s all neat, it raises the question…&lt;/p&gt;

&lt;h4 id=&quot;what-can-you-do-with-table-functions&quot;&gt;What can you do with table functions?&lt;/h4&gt;

&lt;p&gt;While standard table functions are cool, they have to return a pre-defined
schema, which limits their flexibility. They still have some interesting uses,
though, as a means of shortening queries or performing multiple operations at
once. If you frequently find yourself selecting from the same table with a
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WHERE&lt;/code&gt; clause that compares a specific column against a value that changes
each time, you could define a table function that takes the value as a
parameter and skip all the copying and pasting just for the sake of one changed
line. You could take an extremely lengthy sub-query with multiple joins,
abbreviate it to something as short as one of the examples above, and then
reuse it in other queries. Or, if you want to update one table and also insert
into another table as part of the same operation, you could combine those two
steps into one table function, ensuring that users won’t forget the second part
of the process.&lt;/p&gt;
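
&lt;p&gt;As a sketch of that first use case, a hypothetical table function (the
function name, table, and parameter here are invented for illustration) could
replace a query that otherwise gets copied around with only one literal
changing:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-- hand-written each time, with only the literal changing
SELECT * FROM orders WHERE customer_name = &apos;alice&apos;;

-- the same query wrapped in a hypothetical table function
SELECT * FROM TABLE(orders_for_customer(&apos;alice&apos;));
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;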

&lt;p&gt;So table functions are functions that return tables. It really is that simple,
and we’re already two-thirds of the way to understanding what polymorphic table
functions are. And now it’s time to add in that fun ‘polymorphic’ word.&lt;/p&gt;

&lt;h3 id=&quot;what-makes-a-table-function-polymorphic&quot;&gt;What makes a table function polymorphic?&lt;/h3&gt;

&lt;p&gt;A polymorphic table function is a type of table function where the schema of
the returned table is determined dynamically. This means that the returned table
data, including its schema, can be determined by the arguments you pass to the
function. As you might imagine, that makes PTFs a lot more powerful than an
ordinary, run-of-the-mill table function.&lt;/p&gt;

&lt;h4 id=&quot;what-can-you-do-with-polymorphic-table-functions&quot;&gt;What can you do with polymorphic table functions?&lt;/h4&gt;

&lt;p&gt;When you don’t have to determine the schema of the returned table in
advance, you get the flexibility to do some pretty crazy things. It can be as simple as
adding or removing columns as part of the function, or it can be as complex as
building and returning an entirely new table based on some input data.&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-episode-the-many-ways-you-can-leverage-ptfs&quot;&gt;Demo of the episode: The many ways you can leverage PTFs&lt;/h2&gt;

&lt;p&gt;But we’ve talked enough at a high level about what PTFs are, so now it’s a good
time to look at what PTFs can actually do for you to make your life as a Trino
user easier, better, and more efficient.&lt;/p&gt;

&lt;h3 id=&quot;possible-polymorphic-table-functions&quot;&gt;Possible polymorphic table functions&lt;/h3&gt;

&lt;p&gt;One thing to note - all the examples we’re about to look at are &lt;em&gt;hypothetical&lt;/em&gt;.
We’re working to bring functions similar to these to Trino soon, but there are a
few things left to implement before we get there. For now, this section is meant
to highlight why we’re implementing PTFs, and we’ll take a look at what you can
currently do with them a little later. When it does come time to implement
these functions, they won’t look exactly as they do here.&lt;/p&gt;

&lt;h4 id=&quot;select-except&quot;&gt;Select except&lt;/h4&gt;

&lt;p&gt;Imagine a table with 10 columns, named col1, col2, col3, etc. If you want to
select all the columns except the first one from that table, you end up with a
query that looks like:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;col2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;col3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;col4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;col5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;col6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;col7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;col8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;col9&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;col10&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;my&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But that’s long, and it’s a pain to type, and it gets messy, especially if your
column names aren’t extremely short due to being part of a contrived example.
With a simple PTF, you could get the same result with:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;excl_function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;my&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;columns_to_exclude&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;&quot;col1&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, this isn’t a great PTF, because it’s going to take more time to implement
than it takes to just write out your column names, and at least when we’re using
only 10 columns and short column names, invoking the function takes more writing
than doing it the old-fashioned way. Also, this is going to perform worse than
writing the query the ordinary way. As a rule of thumb, if it can be written
with normal SQL, it will be more performant when done that way. There are plans
to work on optimizing PTFs, but that’s not going to happen soon, so for the time
being, we’re focusing on how they enable things which previously couldn’t
be done at all, rather than making queries look nicer or cleaner.&lt;/p&gt;

&lt;p&gt;All that said, we wanted to include this example because this does a good job at
demonstrating how polymorphic table functions can work and what they can do for
you. But it’s a simple example, and now we can look at some which are a little
more complex and a little more practical.&lt;/p&gt;

&lt;h4 id=&quot;csvreader&quot;&gt;CSVreader&lt;/h4&gt;

&lt;p&gt;If you’ve ever tried to create a table from a CSV file, you know it can be a
painful experience. You have to be very explicit and very diligent, and there’s
a lot of manual cross-checking involved in ensuring that each column aligns
perfectly and is correctly typed for the columns present in the CSV. Enter
polymorphic table functions, here to save the day.&lt;/p&gt;

&lt;p&gt;Remember, this is hypothetical, so by the time we get to implementing
something similar in Trino, it will certainly look different. A table function
like this would be defined by the connector, though, so all the end user needs
to worry about is its signature, which might look like:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;FUNCTION&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CSVreader&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Filename&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;VARCHAR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;FloatCols&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DEFAULT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;DateCols&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DEFAULT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;NULL&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;RETURNS&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;One key thing to note here is the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DESCRIPTOR&lt;/code&gt; type. It is a type that describes
a list of column names, and there will be a function to convert a parameterized
list to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DESCRIPTOR&lt;/code&gt; type. Other than that, everything else here does what
you’d expect - you pass the function the name of the CSV file, the columns which
should be typed as floats, and the columns which should have a date typing. All
unspecified columns will still be handled as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;varchar&lt;/code&gt;. Calling the function
might look something like:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
  &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;CSVreader&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;Filename&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;my_file.csv&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;FloatCols&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;&quot;principle&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;&quot;interest&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;DateCols&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;&quot;due_date&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Given a CSV with this content:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-csv&quot;&gt;docno,name,due_date,principle,interest
123,Alice,01/01/2014,234.56,345.67
234,Bob,01/01/2014,654.32,543.21
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Such a function would return a table that looks like:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;docno&lt;/th&gt;
      &lt;th&gt;name&lt;/th&gt;
      &lt;th&gt;due_date&lt;/th&gt;
      &lt;th&gt;principle&lt;/th&gt;
      &lt;th&gt;interest&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;123&lt;/td&gt;
      &lt;td&gt;Alice&lt;/td&gt;
      &lt;td&gt;2014-01-01&lt;/td&gt;
      &lt;td&gt;234.56&lt;/td&gt;
      &lt;td&gt;345.67&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;234&lt;/td&gt;
      &lt;td&gt;Bob&lt;/td&gt;
      &lt;td&gt;2014-01-01&lt;/td&gt;
      &lt;td&gt;654.32&lt;/td&gt;
      &lt;td&gt;543.21&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;With a well-written PTF, the days of toiling over parsing a CSV into SQL are
over!&lt;/p&gt;

&lt;h4 id=&quot;pivot&quot;&gt;Pivot&lt;/h4&gt;

&lt;p&gt;Pivot is an oft-requested feature which hasn’t been built in Trino because it
isn’t a part of the standard SQL specification. A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PIVOT&lt;/code&gt; keyword or built-in
function isn’t planned, but with PTFs, we can support &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PIVOT&lt;/code&gt;-like functionality
without needing to deviate from SQL.&lt;/p&gt;

&lt;p&gt;A &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PIVOT&lt;/code&gt; PTF might have the following definition:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;FUNCTION&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Pivot&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Input_table&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PASS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;THROUGH&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ROW&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SEMANTICS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Output_pivot_columns&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Input_pivot_columns1&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Input_pivot_columns2&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DEFAULT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Input_pivot_columns3&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DEFAULT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Input_pivot_columns4&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DEFAULT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Input_pivot_columns5&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DEFAULT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;NULL&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;RETURNS&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But before we look at how you can invoke this, there are a few clauses here
that are worth explaining…&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PASS THROUGH&lt;/code&gt; means that the input data (and all of its rows) will be fully
available in the output. The alternative to this is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NO PASS THROUGH&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITH ROW SEMANTICS&lt;/code&gt; means that the result will be determined on a row-by-row
basis. The alternative to this is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITH SET SEMANTICS&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
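
&lt;p&gt;As an aside on that last point: for table arguments declared &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITH SET SEMANTICS&lt;/code&gt;,
the SQL standard also lets the caller partition and order the input table, so
the function processes each partition as a unit. A sketch of what such an
invocation could look like, with a made-up function and table:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT
  *
FROM
  TABLE(
    summarize_by_region(
      input =&amp;gt; TABLE(orders) PARTITION BY region ORDER BY order_date
    )
  );
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;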

&lt;p&gt;And of course, the function takes some parameters, so a good function author
defines what those parameters do.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;‘Input_table’ is the input table. It can be any table.&lt;/li&gt;
  &lt;li&gt;‘Output_pivot_columns’ holds the names of the columns to be created in the
pivot table.&lt;/li&gt;
  &lt;li&gt;The ‘Input_pivot_columns’ parameters are the groups of columns to be
pivoted into the output columns. The first group is required, and you can
specify up to four more. The number of input columns in each group must match
the number of output columns.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So you’ve got a PIVOT function, and you understand how to invoke it, so all you
need to do is listen to &lt;a href=&quot;https://youtu.be/8w3wmQAMoxQ?t=82&quot;&gt;Ross from Friends&lt;/a&gt;
and make it happen:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;D&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;D&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;accttype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;acctvalue&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Pivot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;Input_table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;My&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;Data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;D&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;Output_pivot_columns&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;accttype&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acctvalue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;Input_pivot_columns1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;accttype1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acctvalue1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;Input_pivot_columns2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;accttype2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;acctvalue2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If we presume we have this data in My.Data:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;ID&lt;/th&gt;
      &lt;th&gt;Name&lt;/th&gt;
      &lt;th&gt;accttype1&lt;/th&gt;
      &lt;th&gt;acctvalue1&lt;/th&gt;
      &lt;th&gt;accttype2&lt;/th&gt;
      &lt;th&gt;acctvalue2&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;123&lt;/td&gt;
      &lt;td&gt;Alice&lt;/td&gt;
      &lt;td&gt;external&lt;/td&gt;
      &lt;td&gt;20000&lt;/td&gt;
      &lt;td&gt;internal&lt;/td&gt;
      &lt;td&gt;350&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;234&lt;/td&gt;
      &lt;td&gt;Bob&lt;/td&gt;
      &lt;td&gt;external&lt;/td&gt;
      &lt;td&gt;25000&lt;/td&gt;
      &lt;td&gt;internal&lt;/td&gt;
      &lt;td&gt;120&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The output of that query will be:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;ID&lt;/th&gt;
      &lt;th&gt;Name&lt;/th&gt;
      &lt;th&gt;accttype&lt;/th&gt;
      &lt;th&gt;acctvalue&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;123&lt;/td&gt;
      &lt;td&gt;Alice&lt;/td&gt;
      &lt;td&gt;external&lt;/td&gt;
      &lt;td&gt;20000&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;123&lt;/td&gt;
      &lt;td&gt;Alice&lt;/td&gt;
      &lt;td&gt;internal&lt;/td&gt;
      &lt;td&gt;350&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;234&lt;/td&gt;
      &lt;td&gt;Bob&lt;/td&gt;
      &lt;td&gt;external&lt;/td&gt;
      &lt;td&gt;25000&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;234&lt;/td&gt;
      &lt;td&gt;Bob&lt;/td&gt;
      &lt;td&gt;internal&lt;/td&gt;
      &lt;td&gt;120&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;You can see the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PASS THROUGH&lt;/code&gt; clause in action when you select &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;D.id&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;D.name&lt;/code&gt;.&lt;/p&gt;

&lt;h4 id=&quot;execr&quot;&gt;ExecR&lt;/h4&gt;

&lt;p&gt;As a bonus cherry on top, and as an example of something very fun that you can
do with PTFs, how about executing an entire script written in R?&lt;/p&gt;

&lt;p&gt;A connector could provide a function with the signature:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;FUNCTION&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ExecR&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Script&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;VARCHAR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Input_table&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;NO&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PASS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;THROUGH&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;SET&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SEMANTICS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Rowtype&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;RETURNS&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The inputs here are the script, which can simply be pasted into the query as
text; the input table, which contains the data for the script to run on; and a
descriptor for the row type, since there’s otherwise no way for the engine to
know the schema of the output after running the R script. Worth pointing out,
and contrary to the PIVOT example: this function has &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NO PASS THROUGH&lt;/code&gt; because the R script will not have
the ability to copy input rows into output rows.&lt;/p&gt;

&lt;p&gt;Invoking this function is relatively straightforward:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ExecR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;Script&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;...&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;Input_table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;My&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;Data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;Rowtype&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;DESCRIPTOR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;col1&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;VARCHAR&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;col2&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;REAL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;col3&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;DOUBLE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;R&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And depending on your script and your data, you can make this as simple or as
extreme as you’d like!&lt;/p&gt;

&lt;h2 id=&quot;pull-request-of-the-episode-pr-12325-support-query-pass-through-for-jdbc-based-connectors&quot;&gt;Pull request of the episode: PR 12325: Support query pass-through for JDBC-based connectors&lt;/h2&gt;

&lt;p&gt;We’ve spent a lot of time talking about hypothetical value that we will be able
to derive from polymorphic table functions sometime down the line, but we should
also pump the brakes a little and take a look at what we &lt;em&gt;already&lt;/em&gt; have in Trino
in terms of polymorphic table functions. This PR, authored by Kasia Findeisen,
was the first code to land in Trino that allowed access to PTFs. It’s just one
particular PTF, but it’s pretty neat, so we can jump into it with a demo and an
explanation for how we’re already changing the game with PTFs.&lt;/p&gt;

&lt;h3 id=&quot;demo-of-the-episode-2-using-connector-specific-features-with-query-pass-through&quot;&gt;Demo of the episode #2: Using connector-specific features with query pass-through&lt;/h3&gt;

&lt;p&gt;Trino sticks to the SQL standard, which means that custom extensions and syntax
aren’t supported. If you’re using a Trino connector where the underlying
database has a neat feature that isn’t part of the SQL standard, you were
previously unable to take advantage of it, and you knew it wasn’t going to be
added to Trino. Now, with query pass-through, you can leverage the cool
non-standard extensions of the underlying databases! We’ll look at a couple of
different examples, but keep in mind that because this pushes an entire query
down to the data source, the possibilities depend on what the underlying
database is capable of.&lt;/p&gt;

&lt;h4 id=&quot;group_concat-in-mysql&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GROUP_CONCAT()&lt;/code&gt; in MySQL&lt;/h4&gt;

&lt;p&gt;In a table where we have employees and their manager ID, but no direct way to
list managers with all their employees, we can push down a query to MySQL and
use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GROUP_CONCAT()&lt;/code&gt; to combine them all into one column with this query:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT
  *
FROM
  TABLE(
    mysql.system.query(
      query =&amp;gt; &apos;SELECT
        manager_id, GROUP_CONCAT(employee_id)
      FROM
        company.employees
      GROUP BY
        manager_id&apos;
    )
  );
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h4 id=&quot;model-clause-in-oracle&quot;&gt;MODEL clause in Oracle&lt;/h4&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MODEL&lt;/code&gt; clause in Oracle is an incredibly powerful way to manipulate and
view data. As it isn’t ANSI-compliant, it’s specific to Oracle, but if you want
to use it, now you can! Through polymorphic table functions, you can generate
and perform sophisticated calculations on multidimensional arrays - try saying
that five times fast. We don’t have the time to explain everything about how
this feature works, but if you want clarification, you can check out
&lt;a href=&quot;https://docs.oracle.com/cd/B19306_01/server.102/b14223/sqlmodel.htm&quot;&gt;the Oracle documentation on MODEL&lt;/a&gt;
and try it out for yourself.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT
  SUBSTR(country, 1, 20) country,
  SUBSTR(product, 1, 15) product,
  year,
  sales
FROM
  TABLE(
    oracle.system.query(
      query =&amp;gt; &apos;SELECT
        *
      FROM
        sales_view
      MODEL
        RETURN UPDATED ROWS
        MAIN
          simple_model
        PARTITION BY
          country
        MEASURES
          sales
        RULES
          (sales[&apos;Bounce&apos;, 2001] = 1000,
          sales[&apos;Bounce&apos;, 2002] = sales[&apos;Bounce&apos;, 2001] + sales[&apos;Bounce&apos;, 2000],
          sales[&apos;Y Box&apos;, 2002] = sales[&apos;Y Box&apos;, 2001])
      ORDER BY
        country&apos;
    )
  );
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Funnily enough, Oracle also supports polymorphic table functions, so if you
wanted to, you could use the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query&lt;/code&gt; function to then invoke a PTF in Oracle,
including any of the hypothetical examples we went into above! PTFs inside of
PTFs are possible! …though probably not the best idea.&lt;/p&gt;
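
&lt;p&gt;Purely as an illustration, a nested invocation could look like the following,
where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my_oracle_ptf&lt;/code&gt; is a hypothetical table function defined on the
Oracle side and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;employees&lt;/code&gt; is a placeholder table:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT
  *
FROM
  TABLE(
    oracle.system.query(
      query =&amp;gt; &apos;SELECT * FROM my_oracle_ptf(employees)&apos;
    )
  );
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;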

&lt;h2 id=&quot;question-of-the-episode-where-are-we-at-and-whats-coming-next&quot;&gt;Question of the episode: Where are we at, and what’s coming next?&lt;/h2&gt;

&lt;p&gt;Right now, there are a few things on the radar for moving forward with PTFs. The
first and simpler task at hand is expanding the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query&lt;/code&gt; function to other
connectors. We started with the JDBC connectors, but we have also landed a
similar function called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;raw_query&lt;/code&gt; for Elasticsearch, are working on a BigQuery
implementation, and there may still be more yet to come.&lt;/p&gt;
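
&lt;p&gt;As a sketch of what that looks like for Elasticsearch, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;raw_query&lt;/code&gt; function
accepts a full query written in the Elasticsearch Query DSL. The catalog name
and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;orders&lt;/code&gt; index here are just placeholders for your own setup:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT
  *
FROM
  TABLE(
    elasticsearch.system.raw_query(
      schema =&amp;gt; &apos;default&apos;,
      index =&amp;gt; &apos;orders&apos;,
      query =&amp;gt; &apos;{
        &quot;query&quot;: {
          &quot;match&quot;: {
            &quot;status&quot;: &quot;shipped&quot;
          }
        }
      }&apos;
    )
  );
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;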

&lt;p&gt;On a broader scope, the reason this was the first PTF to be implemented is
that Trino doesn’t have to do anything to make it work. The next big step in
powering PTFs up is to create an operator and make the engine aware of them, so
that the engine can handle and process PTFs itself, which will open the door to
the wide array of possibilities we explored earlier.&lt;/p&gt;

&lt;p&gt;And finally, once that’s done, we plan on empowering you, the Trino community,
to go out and actually &lt;em&gt;make&lt;/em&gt; some polymorphic table functions. You already can
implement them today, but with some limitations: you can’t use table or
descriptor arguments, and the connector has to perform the execution. But once
the full framework for PTFs has been built, those examples from earlier (and
many possible others) still need to be implemented. There is a
&lt;a href=&quot;https://trino.io/docs/current/develop/table-functions.html&quot;&gt;developer guide&lt;/a&gt; on
implementing table functions which exists today, but there are plans to expand
it so that it’s easier to go in and add the PTFs which will make a difference
for you and your workflows.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Check out the in-person and virtual
&lt;a href=&quot;https://www.meetup.com/pro/trino-community/&quot;&gt;Trino Meetup groups&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests</summary>

      
      
    </entry>
  
    <entry>
      <title>37: Trino powers up the community support</title>
      <link href="https://trino.io/episodes/37.html" rel="alternate" type="text/html" title="37: Trino powers up the community support" />
      <published>2022-06-16T00:00:00+00:00</published>
      <updated>2022-06-16T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/37</id>
      <content type="html" xml:base="https://trino.io/episodes/37.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;p&gt;In this episode we have the pleasure to chat with our colleagues, who now make 
the Trino community better every day:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/in/cole-m-bowden/&quot;&gt;Cole Bowden&lt;/a&gt;, Developer Advocate at Starburst&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://twitter.com/n1neinchnick&quot;&gt;Jan Waś&lt;/a&gt;, Software Engineer at Starburst&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://twitter.com/KostasPardalis&quot;&gt;Kostas Pardalis&lt;/a&gt;, Group Product Manager at Starburst&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://twitter.com/Moni4489&quot;&gt;Monica Miller&lt;/a&gt;, Developer Advocate at Starburst&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-382-to-386&quot;&gt;Releases 382 to 386&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-382.html&quot;&gt;Trino 382&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for reading wildcard tables in the BigQuery connector.&lt;/li&gt;
  &lt;li&gt;Support for adding columns in the Delta Lake connector.&lt;/li&gt;
  &lt;li&gt;Support updating Iceberg table partitioning.&lt;/li&gt;
  &lt;li&gt;Improved &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT&lt;/code&gt; performance in the MySQL, Oracle, and PostgreSQL connectors.&lt;/li&gt;
  &lt;li&gt;Basic authentication in the Prometheus connector.&lt;/li&gt;
  &lt;li&gt;Exchange spooling on Google Cloud Storage.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-383.html&quot;&gt;Trino 383&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;json_exists&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;json_query&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;json_value&lt;/code&gt; functions.&lt;/li&gt;
  &lt;li&gt;Support for table comments in the Delta Lake connector.&lt;/li&gt;
  &lt;li&gt;Support IAM roles for exchange spooling on S3.&lt;/li&gt;
  &lt;li&gt;Improved performance for aggregation queries.&lt;/li&gt;
&lt;/ul&gt;
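
&lt;p&gt;As a quick sketch of those new JSON functions, using inline literal values
rather than real table data:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT
  json_exists(&apos;{&quot;name&quot;: &quot;Trino&quot;}&apos;, &apos;lax $.name&apos;) AS has_name,
  json_value(&apos;{&quot;name&quot;: &quot;Trino&quot;}&apos;, &apos;lax $.name&apos;) AS name,
  json_query(&apos;{&quot;tags&quot;: [&quot;sql&quot;, &quot;olap&quot;]}&apos;, &apos;lax $.tags&apos;) AS tags;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;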

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-384.html&quot;&gt;Trino 384&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for new pass-through query table function for Druid, MariaDB, MySQL,
Oracle, PostgreSQL, Redshift, SingleStore and SQL Server.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-385.html&quot;&gt;Trino 385&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;json_array&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;json_object&lt;/code&gt; functions.&lt;/li&gt;
  &lt;li&gt;Support for time travel syntax in the Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;timestamp(p)&lt;/code&gt; type in MariaDB connector.&lt;/li&gt;
  &lt;li&gt;Performance improvements in Iceberg connector.&lt;/li&gt;
&lt;/ul&gt;
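
&lt;p&gt;The Iceberg time travel syntax lets you query an earlier state of a table,
either by timestamp or by snapshot ID. The table name and snapshot ID below are
placeholders for your own environment:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT * FROM iceberg.logging.logs FOR TIMESTAMP AS OF TIMESTAMP &apos;2022-06-01 10:00:00 UTC&apos;;

SELECT * FROM iceberg.logging.logs FOR VERSION AS OF 2653342350928201552;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;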

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-386.html&quot;&gt;Trino 386&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Improved performance for fault-tolerant query execution&lt;/li&gt;
  &lt;li&gt;Faster queries on Delta Lake&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additional highlights worth a mention according to Manfred:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;383 had a regression, don’t use it.&lt;/li&gt;
  &lt;li&gt;As mentioned last time, exchange spooling is now supported on the three major
cloud object storage systems.&lt;/li&gt;
  &lt;li&gt;Query pass-through table function is a massive feature. We are adding this to
other connectors, and more details are coming in a future special episode.&lt;/li&gt;
  &lt;li&gt;Special props to &lt;a href=&quot;https://github.com/kasiafi&quot;&gt;Kasia&lt;/a&gt; for all the new JSON functions.&lt;/li&gt;
  &lt;li&gt;Phoenix 4 support is gone.&lt;/li&gt;
&lt;/ul&gt;
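
&lt;p&gt;As a rough sketch, enabling fault-tolerant execution with exchange spooling on
S3 involves configuration along these lines. The bucket name is a placeholder,
and the fault-tolerant execution documentation covers the full set of
properties:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# config.properties
retry-policy=TASK

# exchange-manager.properties
exchange-manager.name=filesystem
exchange.base-directories=s3://exchange-spooling-bucket
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;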

&lt;p&gt;More detailed information is available in the release notes for
&lt;a href=&quot;https://trino.io/docs/current/release/release-382.html&quot;&gt;Trino 382&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-383.html&quot;&gt;Trino 383&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-384.html&quot;&gt;Trino 384&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-385.html&quot;&gt;Trino 385&lt;/a&gt;,
and
&lt;a href=&quot;https://trino.io/docs/current/release/release-386.html&quot;&gt;Trino 386&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-how-to-strengthen-the-trino-community&quot;&gt;Concept of the episode: How to strengthen the Trino community&lt;/h2&gt;

&lt;p&gt;What is community, and why has this word seen more use around technical projects,
particularly those in the open-source space? There’s really no formal definition
of community in the context of technology. David Spinks, author of the book 
“The Business of Belonging”, defines community as:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;A group of people who feel a shared sense of belonging.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For technical projects, this sense of belonging generally comes from the shared
affinity towards a specific product, like Trino, or it could be a brand that
hosts many products, like Google or Microsoft. There’s a lot that could be 
discussed here regarding why communities have become an essential ingredient to
a project’s success. The quick answer I like to offer is that projects,
open-source or proprietary, that have strong communities behind them
innovate and grow faster, and are more successful overall.&lt;/p&gt;

&lt;p&gt;As such, the Trino Software Foundation (TSF) recognizes that Trino will only be
as successful as the health of the community that builds, tests, uses, and 
shares it. The activities around building a technical community fall in between
engineering, marketing, and customer enablement. A common name that encompasses
the individuals that work in this space is developer relations, DevRel for
short. The goal of our work with the maintainers, contributors, users, and all 
other members of the community is the following:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Grow all aspects of the Trino project, and the Trino community to empower
current and future members of the community.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We introduce some new faces who are stewards in our journey to growing the
adoption of our favorite query engine, what each of them does, and how their
work impacts you as a community member! Most importantly, you can learn how to
get involved and help us learn how to best navigate ideas, issues, or any other
contributions you may have that help Trino to be the best query engine.&lt;/p&gt;

&lt;h3 id=&quot;improving-the-onboarding-and-getting-started-pages&quot;&gt;Improving the onboarding and getting started pages&lt;/h3&gt;

&lt;p&gt;We don’t really have a seamless onboarding experience for new users. Many 
members have asked questions on where to get started. One logical place people
tend to go when browsing the front page of the Trino site is the 
&lt;a href=&quot;https://trino.io/download.html&quot;&gt;getting started tab&lt;/a&gt;, which is ironically 
still on the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trino.io/download.html&lt;/code&gt; page. When you open this page, you are
brought to a page primarily containing the latest binary downloads, some
community links, and some reading material to books and other resources.&lt;/p&gt;

&lt;p&gt;The main thing you don’t really see is much getting started material. A lot of
the material is intermediate level at best. There are not many beginner-level
guides to offer the self-service onboarding many are looking for when they just
want to play around without having to bother or wait for anyone to respond. As
it stands today, Brian and Monica have started some work in this area to make
the onboarding simpler.&lt;/p&gt;

&lt;p&gt;A very common self-service getting started material is the 
&lt;a href=&quot;https://github.com/bitsondatadev/trino-getting-started&quot;&gt;trino-getting-started&lt;/a&gt;
repo that Brian created to host demonstrations for the broadcast
to show off new features or connector capabilities. This has been a good way
to offer a simple environment to get people started. However, the only way
to find this repository today is to ask someone. It would be ideal to showcase
getting started materials as part of the default experience of learning about
Trino.&lt;/p&gt;

&lt;p&gt;Monica is now working on building up some demos using SaaS products like
Starburst Galaxy as another method of using Trino without needing to install
Docker or use any of your own hardware to run through some examples.
These options are typically more UI driven and much more approachable by
members of the community who aren’t engineers or administrators.&lt;/p&gt;

&lt;h3 id=&quot;release-process&quot;&gt;Release process&lt;/h3&gt;

&lt;h4 id=&quot;filling-out-a-pull-request&quot;&gt;Filling out a pull request&lt;/h4&gt;

&lt;p&gt;We’ve got a handy PR template that exists for all contributors to use when 
they’re submitting a pull request to Trino. Most of it is simple and
self-explanatory. We ask you to describe what’s happening, where the change is
happening, and what type of change it is. These are for the sake of the
reviewers, giving them some important context so they understand what’s going on
when they review the code. For simpler changes, it’s not usually necessary to go
into a ton of detail here, but it’s nice to give a little summary for anyone looking at the PR.&lt;/p&gt;

&lt;p&gt;The next steps are what really matter for every single PR that’s going to be
merged - the documentation and release notes for a change. These are about
communicating to our users. Documentation refers to Trino docs, not code
comments. If Trino users need to be told how to use the feature you’re
changing because of how you’re changing it, that means we need to have
documentation for it. The PR template gives the options for how to go about
this, but it’s incredibly helpful to have this filled out. Similarly, we ask
whether or not release notes are necessary for the change, and what release
notes you propose for your change. Generally speaking, if it needs to be
documented, it almost always should have a release note. Even if it isn’t
documented, a release note is often a good idea - things like performance
improvements don’t require our users to change how they use Trino, but they
won’t mind knowing that something has gotten better! The release process
involves heavy editing of release notes, so it’s ok for the suggested note to be
imperfect.&lt;/p&gt;

&lt;h3 id=&quot;what-is-developer-experience-devex&quot;&gt;What is developer experience (DevEx)?&lt;/h3&gt;

&lt;p&gt;Trino is a technology that is built by developers, but also heavily used by 
developers. We want to ensure that the experience of both contributors and users
of Trino is the best possible. To do that, we have to focus on many different
aspects of this experience, from committing code to the CLIs and tools we offer
for debugging queries and most importantly to building a sustainable community
that can give answers and drive the future of the project. This is what DevEx is
for Trino.&lt;/p&gt;

&lt;h3 id=&quot;community-metrics&quot;&gt;Community metrics&lt;/h3&gt;

&lt;p&gt;A while ago we started gathering metrics related to &lt;a href=&quot;https://github.com/trinodb/trino&quot;&gt;the Trino GitHub repository&lt;/a&gt;.
This helped us identify issues like huge CI queue times. Most importantly, we can verify
that the changes we made improved things, and by how much.&lt;/p&gt;

&lt;p&gt;In February this year, the 95th percentile of the CI queue time (not even the 
total run time!) was as high as almost 7 hours. Trino uses public GitHub runners,
and only 60 jobs can run concurrently. This is a 
bottleneck because Trino has extensive test coverage for the core engine, all
connectors, and other plugins. Because we can’t increase the number of runners,
we looked into doing impact analysis to skip tests for modules not impacted by
any change in a pull request.&lt;/p&gt;

&lt;p&gt;Since April, the 95th percentile of the CI queue time has been under 1 hour, even 
though the number of contributions is at an all-time high.&lt;/p&gt;

&lt;p&gt;We keep track of these selected metrics in reports we create by running queries 
using the Trino CLI, saving the results in a markdown file, and publishing them 
as static pages using GitHub pages. The data is gathered using 
Trino connectors for the GitHub API and Git repositories. There’s a GitHub
Actions workflow running on a schedule that spins up a Trino server, so there’s no
infrastructure to maintain, except for a single S3 bucket. All of it is publicly
available in the &lt;a href=&quot;https://github.com/nineinchnick/trino-cicd&quot;&gt;nineinchnick/trino-cicd&lt;/a&gt; 
repository, and its sidebar links to the GitHub Pages site with the reports.&lt;/p&gt;
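
&lt;p&gt;As a rough sketch of that reporting flow, assuming a CLI version that supports
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MARKDOWN&lt;/code&gt; output format, and with a placeholder server URL and a
hypothetical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pulls&lt;/code&gt; table from the GitHub connector:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;trino --server https://trino.example.com \
  --catalog github --schema default \
  --execute &apos;SELECT state, count(*) AS prs FROM pulls GROUP BY state&apos; \
  --output-format=MARKDOWN &amp;gt; report.md
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;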

&lt;p&gt;We continue to add more reports, like tracking flaky tests or pull request 
activity:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://nineinchnick.github.io/trino-cicd/reports/flaky/&quot;&gt;Flaky tests&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://nineinchnick.github.io/trino-cicd/reports/pr/&quot;&gt;Pull request activity&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By being data-driven and transparent, we make sure to provide a good 
experience for everyone, and this also helps us figure out where we need to 
focus more resources.&lt;/p&gt;

&lt;p&gt;We’re open to suggestions on what to track and which metrics to report on, so 
feel free to open issues and pull requests in the repository mentioned above, or
start a thread on the Trino Slack.&lt;/p&gt;

&lt;h3 id=&quot;pull-request-triage&quot;&gt;Pull request triage&lt;/h3&gt;

&lt;p&gt;One of the things we’ve been tracking over the last couple of weeks has been the 
state of incoming PRs. We want to make sure that
each PR reaches a maintainer, and that they all receive timely feedback after 
asking for a review. The goal in looking into this process is to help 
streamline and improve the time-to-initial-comment. The pleasant discovery 
is that it doesn’t seem like we have a lot of room to improve on that front. Not
to pat ourselves on the back too heavily, but PRs find their way to maintainers,
and get an initial review quite quickly, and there’s little work to be done on
that front.&lt;/p&gt;

&lt;p&gt;Our next exploration is tracking PRs that don’t quickly get
approved and merged, and monitoring their life cycle and making sure follow-up
reviews are happening in a timely manner as well. We now know that we are
effective at giving initial feedback on a PR, but we also want to make sure that
these PRs aren’t falling off a cliff or turning into a long, drawn-out process
where each development iteration is slower than the last.&lt;/p&gt;

&lt;h2 id=&quot;pull-requests-of-the-episode-pr-12259-support-updating-iceberg-table-partitioning&quot;&gt;Pull requests of the episode: PR 12259: Support updating Iceberg table partitioning&lt;/h2&gt;

&lt;p&gt;This month’s &lt;a href=&quot;https://github.com/trinodb/trino/issues/12259&quot;&gt;PR of the episode&lt;/a&gt; 
was contributed by &lt;a href=&quot;https://github.com/alexjo2144&quot;&gt;alexjo2144&lt;/a&gt;. This feature is
an exciting update on the ability to modify the partition specification of a 
table in Iceberg. This is an update since Brian 
&lt;a href=&quot;/blog/2021/07/12/in-place-table-evolution-and-cloud-compatibility-with-iceberg.html&quot;&gt;wrote about this feature&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;At the time of writing, Trino is able to perform reads from tables that have 
multiple partition spec changes but partition evolution write support does not
yet exist.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This brings us much closer to having more feature parity with other query 
engines to manage Iceberg tables entirely through Trino. Thanks to our friend 
&lt;a href=&quot;https://github.com/findinpath&quot;&gt;Marius Grama &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;findinpath&lt;/code&gt;&lt;/a&gt; for the review.&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-episode-iceberg-table-partition-migrations&quot;&gt;Demo of the episode: Iceberg table partition migrations&lt;/h2&gt;

&lt;p&gt;For this episode’s demo, you’ll need a local Trino coordinator, MinIO instance,
and Hive metastore backed by a database. Clone the 
&lt;a href=&quot;https://github.com/bitsondatadev/trino-getting-started&quot;&gt;trino-getting-started&lt;/a&gt; 
repository and navigate to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;iceberg/trino-iceberg-minio&lt;/code&gt; directory. Then 
start up the containers using Docker Compose.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git@github.com:bitsondatadev/trino-getting-started.git

cd trino-getting-started/iceberg/trino-iceberg-minio

docker-compose up -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This demo is actually very similar to a demo we did in 
&lt;a href=&quot;/episodes/15.html&quot;&gt;episode 15&lt;/a&gt;, except now we get to showcase one of Iceberg’s
most exciting features, partition evolution.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;/**
 * Make sure to first create a bucket named &quot;logging&quot; in MinIO before running
 */

CREATE SCHEMA iceberg.logging
WITH (location = &apos;s3a://logging/&apos;);

CREATE TABLE iceberg.logging.logs (
   level varchar NOT NULL,
   event_time timestamp(6) with time zone NOT NULL,
   message varchar NOT NULL,
   call_stack array(varchar)
)
WITH (
   format = &apos;ORC&apos;,
   partitioning = ARRAY[&apos;day(event_time)&apos;]
);

/**
 * Inserting two records. Notice event_time is on the same day but different hours.
 */

INSERT INTO iceberg.logging.logs VALUES 
(
  &apos;ERROR&apos;, 
  timestamp &apos;2021-04-01 12:23:53.383345&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;, 
  &apos;1 message&apos;,
  ARRAY [&apos;Exception in thread &quot;main&quot; java.lang.NullPointerException&apos;]
),
(
  &apos;ERROR&apos;, 
  timestamp &apos;2021-04-01 13:36:23&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;, 
  &apos;2 message&apos;, 
  ARRAY [&apos;Exception in thread &quot;main&quot; java.lang.NullPointerException&apos;]
);

SELECT * FROM iceberg.logging.logs;
SELECT * FROM iceberg.logging.&quot;logs$partitions&quot;;

/**
 * Notice one partition was created for both records at the day granularity.
 */

/**
 * Update the partitioning from daily to hourly 🎉
 */
ALTER TABLE iceberg.logging.logs 
SET PROPERTIES partitioning = ARRAY[&apos;hour(event_time)&apos;];

/**
 * Inserting three records. Notice event_time is on the same day but different hours.
 */
INSERT INTO iceberg.logging.logs VALUES 
(
  &apos;ERROR&apos;, 
  timestamp &apos;2021-04-01 15:55:23&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;, 
  &apos;3 message&apos;, 
  ARRAY [&apos;Exception in thread &quot;main&quot; java.lang.NullPointerException&apos;]
), 
(
  &apos;WARN&apos;, 
  timestamp &apos;2021-04-01 15:55:23&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;, 
  &apos;4 message&apos;, 
  ARRAY [&apos;bad things could be happening&apos;]
), 
(
  &apos;WARN&apos;, 
  timestamp &apos;2021-04-01 16:55:23&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;, 
  &apos;5 message&apos;, 
  ARRAY [&apos;bad things could be happening&apos;]
);

SELECT * FROM iceberg.logging.logs;
SELECT * FROM iceberg.logging.&quot;logs$partitions&quot;;

/**
 * Now there are three partitions:
 * 1) One partition at the day granularity containing our original records.
 * 2) One at the hour granularity for hour 15 containing two new records.
 * 3) One at the hour granularity for hour 16 containing the last new record.
 */

SELECT * FROM iceberg.logging.logs 
WHERE event_time &amp;lt; timestamp &apos;2021-04-01 16:55:23&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;;

/**
 * This query correctly returns 4 records with only the first two partitions
 * being touched. 
 */

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There’s been a lot of cool things going into the Iceberg connector these days,
and another exciting one that came out in release 381 was the support for 
&lt;a href=&quot;https://github.com/trinodb/trino/pull/12026&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UPDATE&lt;/code&gt; in Iceberg&lt;/a&gt;. So we’re 
gonna showcase that:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;/**
 * Update
 */
UPDATE
  iceberg.logging.logs
SET
  call_stack = call_stack || &apos;WHALE HELLO THERE!&apos;
WHERE
  lower(level) = &apos;warn&apos;;

DROP TABLE iceberg.logging.logs;

DROP SCHEMA iceberg.logging;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;question-of-the-episode-can-i-force-a-pushdown-join-into-a-connected-data-source&quot;&gt;Question of the episode: Can I force a pushdown join into a connected data source?&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://www.trinoforum.org/t/forcing-push-down-join-into-connected-data-source/177&quot;&gt;Full question from Trino Forum&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Is there a way to “quote” a sub query, to tell the Trino planner to just push 
down the query and not bother making a sub plan?&lt;/p&gt;

&lt;p&gt;I have a star schema, with one huge table (&amp;gt;100M rows) and a dimension table 
that has static attributes of the huge table.
The dimension table is filtered to create a map, that is joined to the huge 
table. The result is grouped by a dimension, and finally some of the metrics 
from the huge table are aggregated to calculate stats.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Answer:&lt;/em&gt; We’ve recently introduced Polymorphic Table Functions to Trino in 
version 381.&lt;/p&gt;

&lt;p&gt;In version 384, which was just released a few days ago, the query table function
was added in PR 12325.&lt;/p&gt;

&lt;p&gt;For a quick example in MySQL:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;trino&amp;gt; USE mysql.tiny;
USE
trino:tiny&amp;gt; SELECT * FROM TABLE(system.query(query =&amp;gt; &apos;SELECT 1 a&apos;));
a
---
1
(1 row)

trino:tiny&amp;gt; SELECT * FROM TABLE(system.query(query =&amp;gt; &apos;SELECT @@version&apos;));
@@version
-----------
8.0.29
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So this will run exactly that command on the underlying database (not exactly a 
pushdown, but a pass-through) and return the results to Trino as a table. 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SELECT @@version&lt;/code&gt; is MySQL-specific syntax that returns MySQL output as a table
that Trino is now able to further process.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests</summary>

      
      
    </entry>
  
    <entry>
      <title>36: Trino plans to jump to Java 17</title>
      <link href="https://trino.io/episodes/36.html" rel="alternate" type="text/html" title="36: Trino plans to jump to Java 17" />
      <published>2022-05-19T00:00:00+00:00</published>
      <updated>2022-05-19T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/36</id>
      <content type="html" xml:base="https://trino.io/episodes/36.html">&lt;h2 id=&quot;releases-379-to-381&quot;&gt;Releases 379 to 381&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-379.html&quot;&gt;Trino 379&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New MariaDB connector&lt;/li&gt;
  &lt;li&gt;Performance improvements for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JOIN&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UNION&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GROUP BY&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Support for Google Cloud Storage in the Delta Lake connector&lt;/li&gt;
  &lt;li&gt;Support for Pinot 0.10&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-380.html&quot;&gt;Trino 380&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Update Cassandra connector to support v5 and v6 protocols.&lt;/li&gt;
  &lt;li&gt;Rename properties controlling Hive view parsing.&lt;/li&gt;
  &lt;li&gt;Allow changing file and table format with the Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Add support for bulk data insertion in SQL Server connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-381.html&quot;&gt;Trino 381&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UPDATE&lt;/code&gt; in Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Experimental support for table functions.&lt;/li&gt;
  &lt;li&gt;Support for exchange spooling on Azure Blob Storage.&lt;/li&gt;
  &lt;li&gt;Support reading snapshot tables and materialized views in BigQuery connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additional highlights worth a mention according to Manfred:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Next is exchange spooling on &lt;a href=&quot;https://github.com/trinodb/trino/pull/12360&quot;&gt;Google Cloud Storage&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Framework for table functions is in place, implementations in connectors are coming.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ldap.ssl-trust-certificate&lt;/code&gt; as legacy config removes upgrade failures.&lt;/li&gt;
  &lt;li&gt;Introduce the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;least-waste&lt;/code&gt; low memory task killer policy.&lt;/li&gt;
  &lt;li&gt;Disable auto-suggestion in CLI.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the release notes for
&lt;a href=&quot;https://trino.io/docs/current/release/release-379.html&quot;&gt;Trino 379&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-380.html&quot;&gt;Trino 380&lt;/a&gt;,
and
&lt;a href=&quot;https://trino.io/docs/current/release/release-381.html&quot;&gt;Trino 381&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;cinco-de-trino-recap-blog-post&quot;&gt;Cinco de Trino recap blog post&lt;/h3&gt;

&lt;p&gt;Check out this blog post that details all the cool talks that took place at
&lt;a href=&quot;/blog/2022/05/17/cinco-de-trino-recap.html&quot;&gt;Cinco de Trino&lt;/a&gt; and
includes video resources. This was a mini version of the Trino Summit, which
will take place later this year.&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-episode-will-trino-be-making-a-vectorized-c-version-of-trino-workers&quot;&gt;Question of the episode: Will Trino be making a vectorized C++ version of Trino workers?&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trinodb.slack.com/archives/CFLB9AMBN/p1638450883102500&quot;&gt;Full question from Trino Slack&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Answer:&lt;/em&gt; Writing a C++ worker would require each plugin to be implemented in
C++ as well. However, you don’t need C++ for vectorization. Java already does a
technique called &lt;a href=&quot;https://web.archive.org/web/20211111020334/http://daniel-strecker.com/blog/2020-01-14_auto_vectorization_in_java/&quot;&gt;auto-vectorization&lt;/a&gt;,
which we will demonstrate later in the show! Java 17 also introduces the new
&lt;a href=&quot;https://openjdk.java.net/jeps/414&quot;&gt;Vector API&lt;/a&gt;, which unlocks complex usage
patterns that we can invest in moving forward. However, there is much more to
making operations fast than bare metal speed, and that is what we are going to
focus on.&lt;/p&gt;

&lt;p&gt;To demonstrate this, I’d like to use an analogy. Comparing C++ and Java
implementations is like comparing the two fastest men in the world. Usain Bolt
holds the most world records in men’s track to date, and his teammate Yohan
Blake holds many of the second-place titles. Most of us know Usain Bolt is the
faster of the two, and you may not have known or remembered Yohan’s name before.
Want to hear something crazy? Yohan has beaten Usain Bolt in a few races. The
two are so close in speed that races between them come down to milliseconds. The
main difference in this analogy is that speed is the only thing that matters in
an Olympic race. However, programming languages and frameworks have a lot more
tradeoffs.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/36/usain-bolt-yohan-blake.webp&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;The point is, Java is fast and, more importantly, it removes a lot of the
burden of maintaining and scaling out the code. This is conducive to a healthy
open-source project, and lowers the barrier for collaboration. Rather than go
against this and take on the feat of rewriting an entire system in C++, why not
lean into the incredible innovation recent Java features have to offer to
improve performance even more?&lt;/p&gt;

&lt;p&gt;Another important aspect is that rather than chasing the fastest bare metal
speed, it’s also incredibly important to dedicate time to ensuring that Trino’s
optimizer is producing the best possible plans to avoid doing unnecessary work.
To continue with the analogy, imagine we have Usain and Yohan go head to head in
a 100m race on a 400m track. We may expect that Usain will likely win, given
his track record. However, if Usain is given the wrong instructions and runs in
the wrong direction (300m), my bet is that Yohan will win the race.&lt;/p&gt;

&lt;p&gt;In essence, Trino, while still picking up bare metal performance improvements
from the JVM, will focus on not wasting time on suboptimal query plans before or
during runtime. So many optimizations are added in every release that the result
is a work-smarter-not-harder query engine.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-java-17-and-rearchitecting-trino&quot;&gt;Concept of the episode: Java 17 and rearchitecting Trino&lt;/h2&gt;

&lt;p&gt;As Trino prepares to &lt;a href=&quot;https://github.com/trinodb/trino/issues/9876&quot;&gt;update to Java 17&lt;/a&gt;,
we wanted to give a glimpse at what has happened between the current required
JDK version, JDK 11, and future version JDK 17. Both of these versions are
long-term support versions, and in the four years from 11 to 17 
&lt;a href=&quot;https://openjdk.java.net/projects/jdk/17/jeps-since-jdk-11&quot;&gt;a lot of exciting improvements were added&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;java-17-updates&quot;&gt;Java 17 updates&lt;/h3&gt;

&lt;p&gt;Here are some &lt;a href=&quot;https://openjdk.java.net/projects/jdk/17/jeps-since-jdk-11&quot;&gt;updates coming up in Java 17&lt;/a&gt;.&lt;/p&gt;

&lt;h4 id=&quot;performance&quot;&gt;Performance&lt;/h4&gt;

&lt;p&gt;There were several JDK Enhancement Proposals (JEPs) that improve performance,
as well as many small changes to the JVM:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://openjdk.java.net/jeps/339&quot;&gt;JEP 339&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://openjdk.java.net/jeps/352&quot;&gt;JEP 352&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://openjdk.java.net/jeps/356&quot;&gt;JEP 356&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://openjdk.java.net/jeps/387&quot;&gt;JEP 387&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://openjdk.java.net/jeps/412&quot;&gt;JEP 412&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Performance is a multifaceted topic that includes factors like throughput, 
latency, memory footprint, startup, ramp up, pause times, and shut down time.&lt;/p&gt;

&lt;p&gt;You can use standardized benchmarks like
&lt;a href=&quot;https://www.spec.org/jbb2015/&quot;&gt;SPECjbb® 2015&lt;/a&gt; to test a Java application on
most of these performance factors. Aside from the formalized benchmarks, it’s
interesting to see the Java community come up with microbenchmarks to test
relative speedups of JVMs on their own applications.
&lt;a href=&quot;https://www.optaplanner.org/blog/2021/09/15/HowMuchFasterIsJava17.html&quot;&gt;This user benchmark&lt;/a&gt;
found an 8.66% improvement in speed when using the G1 garbage collector. They
isolated modules of their application to measure each microbenchmark separately.&lt;/p&gt;

&lt;p&gt;Martin did a similar test late last year, and reported anywhere from 10-15% 
improvement in speed in Java 17 using the G1 garbage collector. This is an 
exciting development and we hope to publish more about this as we get closer to 
updating.&lt;/p&gt;

&lt;h4 id=&quot;garbage-collectors&quot;&gt;Garbage collectors&lt;/h4&gt;

&lt;p&gt;Although garbage collectors are performance enhancements in their own right,
there are so many exciting changes around garbage collectors between Java 11 and
Java 17 that they earn their own section.&lt;/p&gt;

&lt;p&gt;First, not one but two concurrent garbage collectors have made their way out
of experimental status, and are ready for production use.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://openjdk.java.net/jeps/377&quot;&gt;JEP 377: ZGC: A Scalable Low-Latency Garbage Collector&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://openjdk.java.net/jeps/379&quot;&gt;JEP 379: Shenandoah: A Low-Pause-Time Garbage Collector&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
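
&lt;p&gt;If you want to try one of these collectors, selecting it is a single flag in
Trino’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jvm.config&lt;/code&gt;. As a sketch (G1 remains the
recommended default for Trino, so treat this as an experiment rather than a
recommendation):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# pick exactly one garbage collector
-XX:+UseZGC
# or
-XX:+UseShenandoahGC
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;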

&lt;p&gt;Aside from that, there are a bunch of big improvements to G1.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://openjdk.java.net/jeps/344&quot;&gt;JEP 344: Abortable Mixed Collections for G1&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://openjdk.java.net/jeps/345&quot;&gt;JEP 345: NUMA-Aware Memory Allocation for G1&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://openjdk.java.net/jeps/346&quot;&gt;JEP 346: Promptly Return Unused Committed Memory from G1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In a &lt;a href=&quot;https://kstefanj.github.io/2021/11/24/gc-progress-8-17.html&quot;&gt;fantastic writeup and benchmark&lt;/a&gt;
by Stefan Johansson, they ran the &lt;a href=&quot;https://www.spec.org/jbb2015/&quot;&gt;SPECjbb® 2015&lt;/a&gt;
benchmark to evaluate the improvements of the different garbage collectors
across LTS versions.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/36/throughput.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://kstefanj.github.io/2021/11/24/gc-progress-8-17.html&quot;&gt;Stefan Johansson&apos;s Blog&lt;/a&gt;
&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/36/latency.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://kstefanj.github.io/2021/11/24/gc-progress-8-17.html&quot;&gt;Stefan Johansson&apos;s Blog&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;Pay attention to this chart, as it showcases the advantage of having a 
concurrent garbage collector like ZGC or Shenandoah that doesn’t interfere with
your application code. It’s incredible that 99% of the GC operations only took 
0.1ms. Wild!&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/36/p99-pause.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://kstefanj.github.io/2021/11/24/gc-progress-8-17.html&quot;&gt;Stefan Johansson&apos;s Blog&lt;/a&gt;
&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/36/footprint.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://kstefanj.github.io/2021/11/24/gc-progress-8-17.html&quot;&gt;Stefan Johansson&apos;s Blog&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;Take particular note of the massive improvement of G1. This is especially
exciting because G1 is recommended for Trino usage. It’s still too early to
determine whether ZGC or Shenandoah will perform better overall, as this depends
on the context in which the JVM is running. One thing to look forward to is the
incredible drop in memory footprint over the different versions!&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/36/g1-memory-footprint.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://www.youtube.com/watch?v=0BpY132mKm0&quot;&gt;Java YouTube Channel&lt;/a&gt;
&lt;/p&gt;

&lt;h4 id=&quot;vector-api-2nd-incubator-status&quot;&gt;Vector API (2nd incubator status)&lt;/h4&gt;

&lt;p&gt;One available capability that is still incubating is the
&lt;a href=&quot;https://openjdk.java.net/jeps/414&quot;&gt;Vector API&lt;/a&gt;. Trino currently takes advantage
of the auto-vectorization that comes for free when the compiler detects a
vectorizable loop, like this one taken from Daniel Strecker’s
&lt;a href=&quot;https://web.archive.org/web/20211111020334/http://daniel-strecker.com/blog/2020-01-14_auto_vectorization_in_java/&quot;&gt;auto-vectorization blog&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-java highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;cm&quot;&gt;/**
 * Run with this command to show native assembly:&amp;lt;br/&amp;gt;
 * java -XX:+UnlockDiagnosticVMOptions
 * -XX:CompileCommand=print,VectorizationMicroBenchmark.square
 * VectorizationMicroBenchmark
 */&lt;/span&gt;
&lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;VectorizationMicroBenchmark&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;

    &lt;span class=&quot;kd&quot;&gt;private&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;square&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;];&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// line 11&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;kd&quot;&gt;public&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nc&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;throws&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Exception&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;kt&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[]&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;float&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1024&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;];&lt;/span&gt;

        &lt;span class=&quot;c1&quot;&gt;// repeatedly invoke the method under test. this&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;// causes the JIT compiler to optimize the method&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;square&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Without auto-vectorization, the compiler emits the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vmulss&lt;/code&gt; instruction
(multiply scalar single-precision). With auto-vectorization, it emits
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vmulps&lt;/code&gt; (multiply packed single-precision), a SIMD
instruction the JIT compiler substituted for us without manual intervention.&lt;/p&gt;

&lt;p&gt;However, this isn’t always so straightforward to detect. As you can see from the
comments in the example, special criteria need to be met. For this, you can use
the Vector API to directly interface with SIMD and GPU instructions. We will 
show more on this in the demo.&lt;/p&gt;
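
&lt;p&gt;As a sketch of what explicit vectorization looks like, here is the same
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;square&lt;/code&gt; loop rewritten against the incubating Vector API. The
class name is ours for illustration; compiling and running it requires
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--add-modules=jdk.incubator.vector&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorSpecies;

public class ExplicitSquare {

    static final VectorSpecies&amp;lt;Float&amp;gt; SPECIES = FloatVector.SPECIES_PREFERRED;

    static void square(float[] a) {
        int i = 0;
        // process as many full SIMD lanes as fit
        for (; i &amp;lt; SPECIES.loopBound(a.length); i += SPECIES.length()) {
            FloatVector v = FloatVector.fromArray(SPECIES, a, i);
            v.mul(v).intoArray(a, i);
        }
        // scalar tail for any leftover elements
        for (; i &amp;lt; a.length; i++) {
            a[i] = a[i] * a[i];
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;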

&lt;h4 id=&quot;language-features&quot;&gt;Language features&lt;/h4&gt;

&lt;p&gt;Beyond the performance improvements, Java 17 includes some exciting new Java 
language updates and improvements. While some may not consider this as exciting
as performance boosts, language enhancements make it easier to write higher 
quality and maintainable code. This is especially important for an open source 
project that is maintained by many individuals.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;A very useful change for Trino is the new support for 
&lt;a href=&quot;https://openjdk.java.net/jeps/378&quot;&gt;multiline text blocks&lt;/a&gt;. This allows you to 
go from having to write a SQL query represented in a one-dimensional string 
literal like this:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  String query = &quot;SELECT \&quot;emp_id\&quot;, \&quot;last_name\&quot; FROM \&quot;employee\&quot;\n&quot; +
                 &quot;WHERE \&quot;city\&quot; = &apos;Indianapolis&apos;\n&quot; +
                 &quot;ORDER BY \&quot;emp_id\&quot;, \&quot;last_name\&quot;;\n&quot;;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;

    &lt;p&gt;to a much more readable two-dimensional string block like this:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  String query = &quot;&quot;&quot;
                 SELECT &quot;emp_id&quot;, &quot;last_name&quot; FROM &quot;employee&quot;
                 WHERE &quot;city&quot; = &apos;Indianapolis&apos;
                 ORDER BY &quot;emp_id&quot;, &quot;last_name&quot;;
                 &quot;&quot;&quot;;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The new &lt;a href=&quot;https://openjdk.java.net/jeps/361&quot;&gt;switch expressions&lt;/a&gt; remove the
difficult-to-read syntax of switches that led to many bugs and confusing code
in the past. Particularly the ambiguity of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;break;&lt;/code&gt; statement logic:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  switch (day) {
      case MONDAY:
      case FRIDAY:
      case SUNDAY:
          System.out.println(6);
          break;
      case TUESDAY:
          System.out.println(7);
          break;
      case THURSDAY:
      case SATURDAY:
          System.out.println(8);
          break;
      case WEDNESDAY:
          System.out.println(9);
          break;
  }
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;

    &lt;p&gt;is made much easier to reason about using a functional clause to define the
  correct code to execute for a set of labels:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  switch (day) {
      case MONDAY, FRIDAY, SUNDAY -&amp;gt; System.out.println(6);
      case TUESDAY                -&amp;gt; System.out.println(7);
      case THURSDAY, SATURDAY     -&amp;gt; System.out.println(8);
      case WEDNESDAY              -&amp;gt; System.out.println(9);
  }
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Having to cast after checking for a type has always been an annoyance to
many Java developers.
&lt;a href=&quot;https://openjdk.java.net/jeps/394&quot;&gt;Pattern Matching for instanceof&lt;/a&gt; makes this
go away. Look at this example you may be familiar with:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  if (obj instanceof String) {
      String s = (String) obj;    // grr...
      ...
  }
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;

    &lt;p&gt;Now imagine you don’t have to have a cast statement for every one of these
  lying around in your codebase:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  if (obj instanceof String s) {
      // Let pattern matching do the work!
      ...
  }
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://openjdk.java.net/jeps/358&quot;&gt;Helpful NullPointerExceptions&lt;/a&gt; are
particularly exciting. Previously, a seemingly reasonless null gave you no
detail, and required you to chase down where it happened in the code. Now the
exception message includes information that pinpoints exactly what was null.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
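
&lt;p&gt;To make that last point concrete, here is a tiny, hypothetical example of the
difference. On Java 11 the exception carries no detail; on Java 17 the message
pinpoints what was null (the exact wording can vary with compiler debug
options):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;String[] words = new String[3];   // elements default to null
int n = words[0].length();        // throws a NullPointerException

// Java 11: java.lang.NullPointerException
// Java 17: java.lang.NullPointerException: Cannot invoke
//          &quot;String.length()&quot; because &quot;words[0]&quot; is null
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;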

&lt;h3 id=&quot;rearchitecting-trino&quot;&gt;Rearchitecting Trino&lt;/h3&gt;

&lt;p&gt;With all these exciting changes, what does this mean for Trino? Let’s first dive 
into the thing that many of our users dread…upgrading.&lt;/p&gt;

&lt;h4 id=&quot;upgrade-to-java-17-when-its-time&quot;&gt;Upgrade to Java 17 (When it’s time)&lt;/h4&gt;

&lt;p&gt;As mentioned before, Java 17 is the current LTS version, following Java 11. Java
17 provides significant improvements that we outlined before. We believe that 
once we update, everyone should be running version 17 to get the best experience
out of Trino. Moving to Java 17 allows us to take advantage of many improvements
to the JDK and the Java language that were introduced since Java 11. There are 
some reasons people say they can’t update.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Updating Java in all the clients and code that calls Trino is tedious.&lt;/p&gt;

    &lt;blockquote&gt;
      &lt;p&gt;Luckily, you only need to update Java on the server that Trino is
 running on. The client or CLI can still run any version of Java.&lt;/p&gt;
    &lt;/blockquote&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There are conflicting Java versions on the nodes where Trino servers run.&lt;/p&gt;

    &lt;blockquote&gt;
      &lt;p&gt;If you are running another application that depends on Java on the same
 node, you shouldn’t be. Ideally Trino runs on its own servers. If there is a
 smaller application there to, for example, monitor Trino, then you should be
 able to install a separate version of Java for it.&lt;/p&gt;
    &lt;/blockquote&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There is a company policy requiring specific JDKs be installed on all 
 servers.&lt;/p&gt;

    &lt;blockquote&gt;
      &lt;p&gt;You can have side-by-side installs of multiple versions of the JDK and use
 the appropriate one. You just need to launch Trino with the correct Java
 command. If your company is against using a newer JDK, you can point out the
 arguments above to update the policy to at least include JDK 17.&lt;/p&gt;
    &lt;/blockquote&gt;
  &lt;/li&gt;
&lt;/ol&gt;
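
&lt;p&gt;For the side-by-side scenario, choosing the JDK at launch time can be as
simple as putting the right install first on the path before starting Trino. A
sketch, with a hypothetical install location:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# hypothetical location of the JDK 17 install
export JAVA_HOME=/usr/lib/jvm/jdk-17
export PATH=&quot;$JAVA_HOME/bin:$PATH&quot;

bin/launcher start
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;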

&lt;h4 id=&quot;iterating-and-improving-trino&quot;&gt;Iterating and improving Trino&lt;/h4&gt;

&lt;p&gt;We’re also in the process of revamping the core execution engine, which 
enables us to implement the following improvements:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Perform adaptive evaluation of expressions based on runtime cost.&lt;/li&gt;
  &lt;li&gt;Specialize evaluation for different data encodings (RLE, dictionary, etc.).&lt;/li&gt;
  &lt;li&gt;Implement tighter evaluation loops that make it easier for the VM to vectorize
automatically and generate better machine code.&lt;/li&gt;
  &lt;li&gt;Implement evaluation of certain operations more efficiently by taking 
advantage of SIMD or GPU-based processing.&lt;/li&gt;
  &lt;li&gt;Columnar evaluation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;project-hummingbird&quot;&gt;Project Hummingbird&lt;/h4&gt;

&lt;p&gt;Just as we did with the efforts around
&lt;a href=&quot;/blog/2022/05/05/tardigrade-launch.html&quot;&gt;Project Tardigrade&lt;/a&gt;, we
want to centralize these efforts, gather a set of motivated community members
around them, and give the project a cool name.&lt;/p&gt;

&lt;p&gt;After some discussion, we would like to announce &lt;em&gt;Project Hummingbird&lt;/em&gt; is the
new banner for the efforts around improving performance and concentrated updates
to the core of Trino.&lt;/p&gt;

&lt;p&gt;We chose hummingbirds as mascots because they are adaptive, light, and fast.
Hummingbirds are the only birds capable of flying in any direction, and they are
incredibly quick. As Trino evolves into a query engine that is capable of
adapting to its environment during query runtime, the comparison to these agile
and beautiful creatures made sense.&lt;/p&gt;

&lt;h4 id=&quot;vectorization-is-not-a-silver-bullet&quot;&gt;Vectorization is not a silver bullet&lt;/h4&gt;

&lt;p&gt;There are many ways to parallelize the operations that we run on the Trino
server. There’s inter-node parallelization, which splits the data to be operated
on across nodes. There’s intra-node parallelization, which generally refers to
multithreading across the cores of a CPU.&lt;/p&gt;

&lt;p&gt;As we start to move towards vectorization, we become more hardware dependent,
and just like with any other hardware setting, your mileage may vary depending
on the limitations of the resources Trino is running on.&lt;/p&gt;

&lt;p&gt;Further, any time parallelization is applied, there is generally some overhead
to coordinate lookups, shuffle more data across processors, and so on.&lt;/p&gt;

&lt;h2 id=&quot;pull-requests-of-the-episode-pr-4649-disable-jit-byte-code-recompilation-cutoffs-in-default-jvmconfig&quot;&gt;Pull request of the episode: PR 4649: Disable JIT byte code recompilation cutoffs in default jvm.config&lt;/h2&gt;

&lt;p&gt;This episode’s &lt;a href=&quot;https://github.com/trinodb/trino/pull/4649&quot;&gt;pull request&lt;/a&gt; was
added by &lt;a href=&quot;https://github.com/shubhamtagra&quot;&gt;Shubham Tagra&lt;/a&gt; to raise the JIT
recompilation cutoffs for large methods in the JVM. If these limits are hit, the
JIT compiler calls an uncommon_trap to deoptimize the code. If the function is
continually retried, a continuous deopt or “deopt storm” can occur, causing a
large loss of CPU time. The underlying behavior is actually a bug in the JVM, so
this pull request provides a workaround.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-XX:PerMethodRecompilationCutoff=10000
-XX:PerBytecodeRecompilationCutoff=10000
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Multiple companies, from
&lt;a href=&quot;/blog/2021/10/06/jvm-issues-at-comcast.html&quot;&gt;Comcast&lt;/a&gt; to
&lt;a href=&quot;https://shopify.engineering/faster-trino-query-execution-infrastructure&quot;&gt;Shopify&lt;/a&gt;,
had reported these “random slowness” issues, which were resolved when these JVM
settings were added.&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-episode-fizzbuzz---simd-style&quot;&gt;Demo of the episode: FizzBuzz - SIMD style!&lt;/h2&gt;

&lt;p&gt;Today I’m stealing, no wait, borrowing a project created by our friend
&lt;a href=&quot;https://twitter.com/gunnarmorling&quot;&gt;Gunnar Morling&lt;/a&gt;. It showcases the
well-known &lt;a href=&quot;https://www.morling.dev/blog/fizzbuzz-simd-style/&quot;&gt;FizzBuzz&lt;/a&gt; game, but
programmatically generates the resulting patterns from the game.&lt;/p&gt;

&lt;p&gt;Make sure you &lt;a href=&quot;https://stackoverflow.com/questions/52524112&quot;&gt;install JDK 17&lt;/a&gt; 
before running this code.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git@github.com:bitsondatadev/simd-fizzbuzz.git

mvn clean verify

java --add-modules=jdk.incubator.vector -jar target/benchmarks.jar -f 1 -wi 5 -i 5
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs and Documentation&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://openjdk.java.net/projects/jdk/17/jeps-since-jdk-11&quot;&gt;JEPs in JDK 17 integrated since JDK 11&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://shopify.engineering/faster-trino-query-execution-infrastructure&quot;&gt;Shopify’s Path to a Faster Trino Query Execution: Infrastructure&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Videos&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=yQqBqix7yTA&quot;&gt;Vector API and Record Serialization&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=1JeoNr6-pZw&quot;&gt;The Vector API in JDK 17&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=0BpY132mKm0&quot;&gt;JDK 8 to JDK 18 in Garbage Collection: 10 Releases, 2000+ Enhancements&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=e2lXj_t7ZBc&quot;&gt;Concurrent Garbage collectors: ZGC &amp;amp; Shenandoah&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Releases 379 to 381</summary>

      
      
    </entry>
  
    <entry>
      <title>35: Packaging and modernizing Trino</title>
      <link href="https://trino.io/episodes/35.html" rel="alternate" type="text/html" title="35: Packaging and modernizing Trino" />
      <published>2022-04-21T00:00:00+00:00</published>
      <updated>2022-04-21T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/35</id>
      <content type="html" xml:base="https://trino.io/episodes/35.html">&lt;h2 id=&quot;releases-375-to-378&quot;&gt;Releases 375 to 378&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-375.html&quot;&gt;Trino 375&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for table comments in the MySQL connector.&lt;/li&gt;
  &lt;li&gt;Improved predicate pushdown for PostgreSQL.&lt;/li&gt;
  &lt;li&gt;Performance improvements for aggregations with filters.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-376.html&quot;&gt;Trino 376&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Better performance when reading Parquet data.&lt;/li&gt;
  &lt;li&gt;Join pushdown for MySQL.&lt;/li&gt;
  &lt;li&gt;Aggregation pushdown for Oracle.&lt;/li&gt;
  &lt;li&gt;Support table and column comments in ClickHouse connector.&lt;/li&gt;
  &lt;li&gt;Support for adding and deleting schemas in Accumulo connector.&lt;/li&gt;
  &lt;li&gt;Support system truststore in CLI and JDBC driver.&lt;/li&gt;
  &lt;li&gt;Two-way TLS/SSL certificate validation with LDAP authentication.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-377.html&quot;&gt;Trino 377&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for standard SQL &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trim&lt;/code&gt; syntax.&lt;/li&gt;
  &lt;li&gt;Better performance for Glue metastore.&lt;/li&gt;
  &lt;li&gt;Join pushdown for SQL Server connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-378.html&quot;&gt;Trino 378&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;to_base32&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;from_base32&lt;/code&gt; functions.&lt;/li&gt;
  &lt;li&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;expire_snapshots&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;delete_orphan_files&lt;/code&gt; table procedures for Iceberg.&lt;/li&gt;
  &lt;li&gt;Faster planning of queries with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IN&lt;/code&gt; predicates.&lt;/li&gt;
  &lt;li&gt;Faster query planning for Hive, Delta Lake, Iceberg, MySQL, PostgreSQL, and
SQL Server connectors.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additional highlights worth a mention according to Manfred:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Generally lots of improvements on Hive, Delta Lake, Iceberg, and main
JDBC-based connectors.&lt;/li&gt;
  &lt;li&gt;Full Iceberg v2 table format support, first for read and later for read and
write operations, is getting closer and closer.&lt;/li&gt;
  &lt;li&gt;Table statistics support for the PostgreSQL, MySQL, and SQL Server
connectors, enabling automatic join pushdown.&lt;/li&gt;
  &lt;li&gt;Fix failure of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DISTINCT .. LIMIT&lt;/code&gt; operator when input data is dictionary
encoded.&lt;/li&gt;
  &lt;li&gt;Add new page to display the runtime information of all workers in the cluster
in Web UI.&lt;/li&gt;
  &lt;li&gt;Remove &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;user&lt;/code&gt; property requirement in JDBC driver.&lt;/li&gt;
  &lt;li&gt;Require an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;internal-communication.shared-secret&lt;/code&gt; value when authentication is
used, a breaking change for many users that have not set that secret.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the release notes for
&lt;a href=&quot;https://trino.io/docs/current/release/release-375.html&quot;&gt;Trino 375&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-376.html&quot;&gt;Trino 376&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-377.html&quot;&gt;Trino 377&lt;/a&gt;,
and
&lt;a href=&quot;https://trino.io/docs/current/release/release-378.html&quot;&gt;Trino 378&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-packaging-trino&quot;&gt;Concept of the episode: Packaging Trino&lt;/h2&gt;

&lt;p&gt;To adopt Trino you typically need to run it on a cluster of machines. These can
be bare metal servers, virtual machines, or even containers. The Trino project
provides a few binary packages to allow you to install Trino:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;tarball&lt;/li&gt;
  &lt;li&gt;rpm&lt;/li&gt;
  &lt;li&gt;container image&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of them include the Java libraries that constitute Trino
and all the plugins. As a result there are only a few requirements: a
Linux operating system, since some of the libraries and code indirectly require
Linux, and a Java 11 runtime.&lt;/p&gt;

&lt;p&gt;Beyond that there is just the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bin/launcher&lt;/code&gt; script, which is highly recommended, but
not required. It can be used as a service script or for manual
start/stop/status of Trino, and it only needs Python.&lt;/p&gt;
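
&lt;p&gt;As a quick sketch, assuming a tarball installation, the launcher is typically
invoked like this from the installation directory:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;bin/launcher run      # run in the foreground, logging to stdout
bin/launcher start    # start as a background daemon
bin/launcher status
bin/launcher stop
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;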

&lt;h3 id=&quot;tarball&quot;&gt;Tarball&lt;/h3&gt;

&lt;p&gt;The tarball is a gzip-compressed tar archive. For installation you just
extract the archive anywhere. It contains the following directory structure:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bin&lt;/code&gt;, the launcher script and related files&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lib&lt;/code&gt;, all globally needed libraries&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;plugins&lt;/code&gt;, connectors and other plugins, each with their own libraries in
separate sub-directories&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need to create the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;etc&lt;/code&gt; directory with the needed configuration, since the
tarball does not include any defaults, and you cannot start the application
without them.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;etc/catalog/*.properties&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;etc/config.properties&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;etc/jvm.config&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;etc/log.properties&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;etc/node.properties&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note that all these files live within the extracted installation directory.&lt;/p&gt;
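
&lt;p&gt;As an illustration, a minimal single-node configuration could look like the
following sketch. The values shown are placeholders and need to be adapted to
your environment:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# etc/config.properties - coordinator and worker in one node
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
discovery.uri=http://localhost:8080

# etc/node.properties
node.environment=demo
node.data-dir=/tmp/trino-data
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;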

&lt;h3 id=&quot;rpm&quot;&gt;RPM&lt;/h3&gt;

&lt;p&gt;The RPM archive is suitable for RPM-based Linux distributions, but testing is
not very thorough across different versions and distributions.&lt;/p&gt;

&lt;p&gt;It adapts the tarball content to the Linux file system hierarchy, hooks the
launcher script up as a daemon script, and adds default configuration files. That
allows you to start Trino right after installing the package, as well as
automatically on system restarts.&lt;/p&gt;

&lt;p&gt;Locations used are &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/trino&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/var/lib/trino&lt;/code&gt;, and others. These are
configured via the launcher script parameters.&lt;/p&gt;

&lt;p&gt;In a nutshell the RPM adds some convenience, but narrows down the supported
Linux distributions. It still requires Java and Python installation and
management.&lt;/p&gt;
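
&lt;p&gt;Installation and service management with the RPM roughly follows the usual
pattern. The file name below is a placeholder for the release you download:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;rpm -i trino-server-rpm-378.noarch.rpm
service trino start
service trino status
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;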

&lt;h3 id=&quot;container-image&quot;&gt;Container image&lt;/h3&gt;

&lt;p&gt;The container image for Trino bundles the necessary Linux operating system,
Java runtime, and Python, and adapts Trino to the container setup.&lt;/p&gt;

&lt;p&gt;The container adds even more convenience, since it is ready to use out of the
box. It allows usage on Kubernetes with the help of the &lt;a href=&quot;https://github.com/trinodb/charts&quot;&gt;Helm
charts&lt;/a&gt;, and includes the required operating
system and application parts automatically.&lt;/p&gt;
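
&lt;p&gt;For example, a single-node container with the default configuration can be
started and queried with the bundled CLI like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker run -d --name trino -p 8080:8080 trinodb/trino
docker exec -it trino trino
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;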

&lt;h3 id=&quot;customization&quot;&gt;Customization&lt;/h3&gt;

&lt;p&gt;All three packages Trino ships are just defaults. They all require further
configuration to adapt Trino to your specific needs in terms of hardware,
connected data sources, security configuration, and so on. All of this can be
done manually or with many existing tools.&lt;/p&gt;

&lt;p&gt;However, you can also take it a step further and create your own package suited
to your needs. The tarball can be used as the source for any customization to
create your own package. The following is a list of options and scenarios:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Use the tarball, but remove unused plugins.&lt;/li&gt;
  &lt;li&gt;Use the tarball as the source to create your own specific package. For
example, a deb archive for use with Ubuntu, or an apk package for Alpine.&lt;/li&gt;
  &lt;li&gt;Create your own RPM similar to &lt;a href=&quot;https://github.com/simpligility/trino-packages&quot;&gt;Manfred’s proof of
concept&lt;/a&gt; that pulls out the
Trino RPM package creation into a separate project.&lt;/li&gt;
  &lt;li&gt;Create your own container image with different base distro, custom set of
plugins, and even with all your configuration baked into the image.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;others&quot;&gt;Others&lt;/h3&gt;

&lt;p&gt;You can also use &lt;a href=&quot;https://formulae.brew.sh/formula/trino&quot;&gt;brew on macOS&lt;/a&gt;, but
that is not suitable for production usage. It is more a convenience to get a
local Trino for playing around.&lt;/p&gt;

&lt;h2 id=&quot;additional-topic-of-the-episode-modernizing-trino-with-java-17&quot;&gt;Additional topic of the episode: Modernizing Trino with Java 17&lt;/h2&gt;

&lt;p&gt;Currently Java 11 is required for Trino. Java 17 is the latest Java LTS
release, with lots of performance, security, and language improvements.
The community has been working hard to make Java 17 support a reality. At this
stage core Trino fully supports Java 17; Starburst Galaxy, for example, already
uses it.&lt;/p&gt;

&lt;p&gt;The maintainers and contributors would like to move to fully support and also
require Java 17 soon. Here is where your input comes in, and we ask that you
let us know your thoughts about questions such as the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Are you looking forward to the new Java 17 language features and other
improvements as a contributor to Trino?&lt;/li&gt;
  &lt;li&gt;Are you already using Java 17 with Trino? In production or just testing?&lt;/li&gt;
  &lt;li&gt;If we require Java 17 in the next months, can you update to use Java 17 with
Trino?&lt;/li&gt;
  &lt;li&gt;If not, what are some of the hurdles?&lt;/li&gt;
  &lt;li&gt;Are you okay with staying at an older release, until you can use Java 17?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let us know on the #dev channel on Trino Slack or ping us directly. You can also
chime in on the &lt;a href=&quot;https://github.com/trinodb/trino/issues/9876&quot;&gt;roadmap issue&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;pull-requests-of-the-episode-worker-stats-in-the-web-ui&quot;&gt;Pull requests of the episode: Worker stats in the Web UI&lt;/h2&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/trinodb/trino/issues/11653&quot;&gt;PR of the episode&lt;/a&gt; was
submitted by &lt;a href=&quot;https://github.com/whutpencil&quot;&gt;GitHub user whutpencil&lt;/a&gt;, and adds a
significant new feature to the web UI. It exposes the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;system.runtime.nodes&lt;/code&gt;
information, that is, statistics for each worker, in brand new pages. What a great
effort! Special thanks also go out to &lt;a href=&quot;https://github.com/dedep&quot;&gt;Dawid Adamek
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dedep&lt;/code&gt;&lt;/a&gt; for the review.&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-episode-tarball-installation-and-new-web-ui-feature&quot;&gt;Demo of the episode: Tarball installation and new Web UI feature&lt;/h2&gt;

&lt;p&gt;In the demo of the episode Manfred shows how to add a worker to a local
tarball install of a coordinator, and then demos the Web UI with the new feature
from the pull request of the episode.&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-episode-are-write-operations-in-delta-lake-supported-for-tables-stored-on-hdfs&quot;&gt;Question of the episode: Are write operations in Delta Lake supported for tables stored on HDFS?&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trinodb.slack.com/archives/CGB0QHWSW/p1650331073409229&quot;&gt;Full question from Slack&lt;/a&gt;:
I was trying the Delta Lake connector. I noticed that write operations are
supported for tables stored on Azure ADLS Gen2, S3 and S3-compatible storage.
Does that mean write operations are not supported for tables stored on HDFS?&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Answer:&lt;/em&gt; HDFS is always implicitly supported for data lake connectors. It isn’t
called out because it is assumed.&lt;/p&gt;

&lt;p&gt;The confusion actually came from an error message used when the user tried to
insert into a Delta Lake table they created in Spark. Then they tried inserting
a record into the table through IntelliJ IDEA and received the following error
message:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Unsupported target SQL type: -155
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;They thought the problem might be a wrong data type for the birthdate column,
and then used the statement below to insert a record into the table.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;INSERT INTO
  presto.people10m (id, firstname, middlename, lastname, gender, birthdate, ssn, salary)
VALUES (1, &apos;a&apos;, &apos;b&apos;, &apos;c&apos;, &apos;male&apos;, timestamp &apos;1990-01-01 00:00:00 +00:00&apos;, &apos;d&apos;, 10);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;However, they got an error message like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Query 20220419_031201_00015_8qe76 failed:
Cannot write to table in hdfs://masters/presto.db/people10m; hdfs not supported
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This turned out to be an issue in the IntelliJ client, not in Trino.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.starburst.io/info/cinco-de-trino/&quot;&gt;Cinco de Trino&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/events/285087048/&quot;&gt;Constructing an Intelligent Data Trellis from your Data Mesh&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Releases 375 to 378</summary>

      
      
    </entry>
  
    <entry>
      <title>34: A big delta for Trino</title>
      <link href="https://trino.io/episodes/34.html" rel="alternate" type="text/html" title="34: A big delta for Trino" />
      <published>2022-03-17T00:00:00+00:00</published>
      <updated>2022-03-17T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/34</id>
      <content type="html" xml:base="https://trino.io/episodes/34.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;p&gt;In this episode Manfred has the pleasure to chat with two colleagues, who
are working on making Trino better every day:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/claudiusli&quot;&gt;Claudius Li&lt;/a&gt;, Product Manager at Starburst&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/jhlodin&quot;&gt;Joe Lodin&lt;/a&gt;, Information Engineer at Starburst&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Brian is out to add another member to his family!&lt;/p&gt;

&lt;h2 id=&quot;releases-372-373-and-374&quot;&gt;Releases 372, 373, and 374&lt;/h2&gt;

&lt;p&gt;Official highlights from Martin Traverso:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-372.html&quot;&gt;Trino 372&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trim_array&lt;/code&gt; function.&lt;/li&gt;
  &lt;li&gt;Support for reading ZSTD-compressed Avro files.&lt;/li&gt;
  &lt;li&gt;Support for column comments in Iceberg.&lt;/li&gt;
  &lt;li&gt;Support for Kerberos authentication in Kudu connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-373.html&quot;&gt;Trino 373&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New Delta Lake connector.&lt;/li&gt;
  &lt;li&gt;Improved performance of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LIKE&lt;/code&gt; when querying Elasticsearch and PostgreSQL.&lt;/li&gt;
  &lt;li&gt;Improved performance when querying partitioned Hive tables.&lt;/li&gt;
  &lt;li&gt;Support access to S3 via HTTP proxy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/release/release-374.html&quot;&gt;Trino 374&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Faster &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GROUP BY&lt;/code&gt; queries.&lt;/li&gt;
  &lt;li&gt;Vim/Emacs editing mode for CLI.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TRUNCATE TABLE&lt;/code&gt; in Cassandra connector.&lt;/li&gt;
  &lt;li&gt;Support &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uint&lt;/code&gt; types in ClickHouse.&lt;/li&gt;
  &lt;li&gt;Support for Glue Metastore in Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CREATE/DROP SCHEMA&lt;/code&gt;, table and column comments in MongoDB.&lt;/li&gt;
  &lt;li&gt;Improved pushdown for PostgreSQL.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additional highlights from Manfred:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Timeout configuration for LDAP authentication.&lt;/li&gt;
  &lt;li&gt;Values related to fault-tolerant execution in Web UI.&lt;/li&gt;
  &lt;li&gt;JDBC &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Driver.getProperties&lt;/code&gt; enables more client applications like DBVisualizer.&lt;/li&gt;
  &lt;li&gt;Vi and Emacs editing modes for interactive CLI usage.&lt;/li&gt;
  &lt;li&gt;Performance improvements in PostgreSQL connector.&lt;/li&gt;
  &lt;li&gt;SingleStore JDBC driver usage, end of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memsql&lt;/code&gt; name.&lt;/li&gt;
  &lt;li&gt;Documentation for the atop connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the
&lt;a href=&quot;https://trino.io/docs/current/release/release-372.html&quot;&gt;Trino 372&lt;/a&gt;,
&lt;a href=&quot;https://trino.io/docs/current/release/release-373.html&quot;&gt;Trino 373&lt;/a&gt;, and
&lt;a href=&quot;https://trino.io/docs/current/release/release-374.html&quot;&gt;Trino 374&lt;/a&gt; release
notes.&lt;/p&gt;

&lt;h2 id=&quot;project-tardigrade-update&quot;&gt;Project Tardigrade update&lt;/h2&gt;

&lt;p&gt;The team around Project Tardigrade joined us in &lt;a href=&quot;./32.html&quot;&gt;episode 32&lt;/a&gt; to talk
about fault tolerant execution of queries in Trino. Now they have posted a
&lt;a href=&quot;/blog/2022/02/16/tardigrade-project-update.html&quot;&gt;status update on our blog&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;It looks like things are really coming along well, and Joe has joined the effort
to &lt;a href=&quot;../docs/current/admin/fault-tolerant-execution.html&quot;&gt;create a first user-facing documentation
set&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The team has also posted a status update on the #project-tardigrade Slack
channel. Everything is ready for the community to perform first real world
testing, and help us make this a great feature set for Trino.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-episode-a-new-connector-for-delta-lake-object-storage&quot;&gt;Concept of the episode: A new connector for Delta Lake object storage&lt;/h2&gt;

&lt;p&gt;It is great to have a new connector in Trino, but what does that even mean?
Let’s find out.&lt;/p&gt;

&lt;h3 id=&quot;what-is-a-connector&quot;&gt;What is a connector?&lt;/h3&gt;

&lt;p&gt;Just a quick refresher. Trino allows you to query many different data sources
with SQL statements. You enable that by creating a &lt;em&gt;catalog&lt;/em&gt; that contains the
configuration to connect to a specific &lt;em&gt;data source&lt;/em&gt;. The data source can be a
relational database, a NoSQL database, or an object storage. A &lt;em&gt;connector&lt;/em&gt; is
the translation layer that maps the concepts in the data source to the Trino
concepts of schemas, tables, rows, columns, data types, and so on. The connector
needs to know how to retrieve the data itself from the data source, and also how
to interact with the metadata.&lt;/p&gt;

&lt;p&gt;Here are some example metadata questions to answer:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;What are the available tables in schema &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;xyz&lt;/code&gt;?&lt;/li&gt;
  &lt;li&gt;What columns does table &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;abc&lt;/code&gt; have and what are the data types?&lt;/li&gt;
  &lt;li&gt;What file format is used by the storage for table &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;efg&lt;/code&gt;?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And some queries about the actual data:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Give me the top 100 rows from table &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Give me all files in partition &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x&lt;/code&gt; in the directory &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;y&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So having a connector for your data source in Trino is a big deal. A connector
unlocks the data to all your SQL analytics powered by Trino, and the underlying
data source doesn’t even have to support SQL.&lt;/p&gt;
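
&lt;p&gt;In SQL terms, the metadata and data questions above map to statements like
the following sketch, where the catalog, schema, and table names are
placeholders:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SHOW TABLES FROM example.xyz;
DESCRIBE example.xyz.abc;
SELECT * FROM example.xyz.abc LIMIT 100;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;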

&lt;h3 id=&quot;what-is-delta-lake&quot;&gt;What is Delta Lake?&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://delta.io/&quot;&gt;Delta Lake&lt;/a&gt; is an evolution of the Hive/Hadoop object
storage data source. It is an open-source storage format. Data is stored in
files, typically using binary formats such as Parquet or ORC. Metadata is stored
in a Hive Metastore Service (HMS).&lt;/p&gt;

&lt;p&gt;Delta Lake supports ACID transactions, time travel, and many other features that
are lacking in the legacy Hive/Hadoop setup. This combination of traditional
data lake storage with data warehouse features is often called a lake house.&lt;/p&gt;

&lt;h3 id=&quot;history-of-the-new-connector&quot;&gt;History of the new connector&lt;/h3&gt;

&lt;p&gt;Delta Lake is fully open source, and part of the larger enterprise platform for
a lake house offered by &lt;a href=&quot;https://databricks.com/&quot;&gt;Databricks&lt;/a&gt;.
&lt;a href=&quot;https://www.starburst.io/&quot;&gt;Starburst&lt;/a&gt; has supported Delta Lake users with a
connector for &lt;a href=&quot;https://docs.starburst.io/index.html#sep&quot;&gt;Starburst Enterprise&lt;/a&gt;
for nearly two years. To foster further adoption and innovation with the
community, the connector was &lt;a href=&quot;https://docs.starburst.io/blog/2022-03-15-delta-lake.html&quot;&gt;donated to Trino in
release 373&lt;/a&gt; and continues to
be improved.&lt;/p&gt;

&lt;h2 id=&quot;pull-requests-of-the-episode-add-delta-lake-connector-and-documentation&quot;&gt;Pull requests of the episode: Add Delta Lake connector and documentation&lt;/h2&gt;

&lt;p&gt;Over 25 developers helped &lt;a href=&quot;https://github.com/jirassimok&quot;&gt;Jakob&lt;/a&gt; with the effort
to &lt;a href=&quot;https://github.com/trinodb/trino/pull/10897&quot;&gt;open-source the connector&lt;/a&gt;. It
is a heavy lift to migrate such a full-featured connector into Trino. By
comparison the &lt;a href=&quot;https://github.com/trinodb/trino/pull/11229&quot;&gt;documentation was
easy&lt;/a&gt;, but it is very important
for enabling you to use the connector. Well done everyone!&lt;/p&gt;

&lt;p&gt;Let’s have a look at the code in a bit more detail. A couple of key facts:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The Delta Lake connector is just another plugin like all other connectors.&lt;/li&gt;
  &lt;li&gt;This is a feature-rich connector supporting read and write operations.&lt;/li&gt;
  &lt;li&gt;It shares implementation details with Hive and Iceberg connectors such as HMS
access, Parquet and ORC file readers, and so on.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;demo-of-the-episode-delta-lake-connector-in-action&quot;&gt;Demo of the episode: Delta Lake connector in action&lt;/h2&gt;

&lt;p&gt;Now let’s have a look at all of this in action. In the demo Claudius uses
docker-compose to start up an HMS as the metastore, MinIO as object storage, and
of course Trino as the query engine.&lt;/p&gt;

&lt;p&gt;If you want to follow along, all resources used for the demo are &lt;a href=&quot;https://github.com/bitsondatadev/trino-getting-started/tree/main/community_tutorials/delta-lake&quot;&gt;available on
our getting started
repository&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here is the sample catalog &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;delta.properties&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-properties highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;py&quot;&gt;connector.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;delta-lake&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;hive.metastore.uri&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;thrift://hive-metastore:9083&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;hive.s3.endpoint&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;http://minio:9000&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;hive.s3.aws-access-key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;minio&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;hive.s3.aws-secret-key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;minio123&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;hive.s3.path-style-access&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;true&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;delta.enable-non-concurrent-writes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once everything is up and running, we can start playing.&lt;/p&gt;

&lt;p&gt;Verify that the catalog is available:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;SHOW&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CATALOGS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Check if there are any schemas:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;SHOW&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SCHEMAS&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let’s create a new schema:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;SCHEMA&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;location&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;s3a://claudiustestbucket/myschema&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Create a table, insert some records, and then verify:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mytable&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;varchar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;id&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;integer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mytable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;John&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;Jane&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mytable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Run a query to get more data and insert it into a new table:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myothertable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AS&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mytable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myothertable&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now for some data manipulation:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;UPDATE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myothertable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;set&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;Jonathan&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;where&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myothertable&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;DELETE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myothertable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;where&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myothertable&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And finally, let’s clean up:&lt;/p&gt;

&lt;div class=&quot;language-sql highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;ALTER&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mytable&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;EXECUTE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;optimize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file_size_threshold&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;10MB&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;ANALYZE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myothertable&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;DROP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myothertable&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;DROP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mytable&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;DROP&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;SCHEMA&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;delta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;myschema&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;As you can see, with Trino and Delta Lake you get full create, read, update,
and delete operations on your lakehouse.&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-episode-how-do-i-secure-the-connection-from-a-trino-cluster-to-the-data-source&quot;&gt;Question of the episode: How do I secure the connection from a Trino cluster to the data source?&lt;/h2&gt;

&lt;p&gt;Since we talked about connectors earlier, you already know that the
configuration for accessing a data source is assembled to create a catalog. This
approach uses a properties file in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;etc/catalog&lt;/code&gt;. For example, let’s look at the
recently updated &lt;a href=&quot;../docs/current/connector/sqlserver.html&quot;&gt;SQL Server connector
documentation&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-properties highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;py&quot;&gt;connector.name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;sqlserver&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;connection-url&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;jdbc:sqlserver://&amp;lt;host&amp;gt;:&amp;lt;port&amp;gt;;database=&amp;lt;database&amp;gt;;encrypt=false&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;connection-user&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;root&lt;/span&gt;
&lt;span class=&quot;py&quot;&gt;connection-password&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;secret&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The connector uses username and password authentication. It connects using the
JDBC driver, which in turn enables TLS by default. A number of other connectors
also use JDBC drivers with username and password authentication, but the details
vary a lot. However, for all of them you can use &lt;a href=&quot;../docs/current/security/secrets.html&quot;&gt;secrets support in
Trino&lt;/a&gt; to use environment variable
references instead of hardcoding passwords.&lt;/p&gt;
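&lt;p&gt;As a minimal sketch, assuming a hypothetical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SQLSERVER_PASSWORD&lt;/code&gt; environment
variable is set on all cluster nodes, the catalog file can then reference the
secret instead of hardcoding the password:&lt;/p&gt;

&lt;div class=&quot;language-properties highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;connector.name=sqlserver
connection-url=jdbc:sqlserver://&amp;lt;host&amp;gt;:&amp;lt;port&amp;gt;;database=&amp;lt;database&amp;gt;
connection-user=root
connection-password=${ENV:SQLSERVER_PASSWORD}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;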

&lt;p&gt;When it comes to other connectors, the details of securing a connection vary even
more. Ultimately the answer to how to secure the connection, and whether that is even
possible, is the usual “It depends”. Luckily you can check the documentation for
each connector to find out more and ping us on Slack if you need more help.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.starburst.io/blog/2022-03-15-delta-lake.html&quot;&gt;Starburst donates the Delta Lake connector to Trino&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/events/282794002/&quot;&gt;Operating Trino at Scale at Robinhood&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests</summary>

      
      
    </entry>
  
    <entry>
      <title>33: Trino becomes highly available for high demand</title>
      <link href="https://trino.io/episodes/33.html" rel="alternate" type="text/html" title="33: Trino becomes highly available for high demand" />
      <published>2022-02-17T00:00:00+00:00</published>
      <updated>2022-02-17T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/33</id>
      <content type="html" xml:base="https://trino.io/episodes/33.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Ramesh Bhanan, Vice President, at &lt;a href=&quot;https://developer.gs.com/discover/home&quot;&gt;Goldman Sachs&lt;/a&gt;
  (&lt;a href=&quot;https://www.linkedin.com/in/ramesh-bhanan-byndoor/&quot;&gt;@ramesh-bhanan-byndoor&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Sambit Dikshit, Managing Director, Tech Fellow at &lt;a href=&quot;https://developer.gs.com/discover/home&quot;&gt;Goldman Sachs&lt;/a&gt;
  (&lt;a href=&quot;https://www.linkedin.com/in/sambitdixit/&quot;&gt;@sambitdixit&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Siddhant Chadha, Senior Data Engineer at &lt;a href=&quot;https://developer.gs.com/discover/home&quot;&gt;Goldman Sachs&lt;/a&gt;
 (&lt;a href=&quot;https://www.linkedin.com/in/siddhant-chadha-838136142/&quot;&gt;@siddhant-chadha&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Suman Baliganahalli Narayan Murthy, Vice President at &lt;a href=&quot;https://developer.gs.com/discover/home&quot;&gt;Goldman Sachs&lt;/a&gt;
 (&lt;a href=&quot;https://www.linkedin.com/in/suman-b-n-08-03-1990/&quot;&gt;@suman-b-n&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Sumit Halder, Vice President at &lt;a href=&quot;https://developer.gs.com/discover/home&quot;&gt;Goldman Sachs&lt;/a&gt;
  (&lt;a href=&quot;https://www.linkedin.com/in/sumit-halder-a3732482/&quot;&gt;@sumit-halder&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;releases-369-370-and-371&quot;&gt;Releases 369, 370, and 371&lt;/h2&gt;

&lt;p&gt;Trino 369&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Experimental support for task level retries.&lt;/li&gt;
  &lt;li&gt;Support for groups in OAuth2 claims.&lt;/li&gt;
  &lt;li&gt;Column comments in ClickHouse connector.&lt;/li&gt;
  &lt;li&gt;Write Bloom filters in ORC files.&lt;/li&gt;
  &lt;li&gt;Procedure for optimizing Iceberg tables.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino 370&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add CLI support for ARM64.&lt;/li&gt;
  &lt;li&gt;Improved performance for ORC.&lt;/li&gt;
  &lt;li&gt;Improved performance for map and row types.&lt;/li&gt;
  &lt;li&gt;Reduced latency for OAuth2.0 authentication.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino 371&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for secrets and user group selector in resource group manager.&lt;/li&gt;
  &lt;li&gt;Support AWS role session name in S3 security mapping configuration.&lt;/li&gt;
  &lt;li&gt;Many bug fixes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Notes from Manfred&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Add support for using PostgreSQL and Oracle as backend database for resource
groups.&lt;/li&gt;
  &lt;li&gt;Remove &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spill-order-by&lt;/code&gt;,  &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spill-window-operator&lt;/code&gt;, and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query.max-total-memory-per-node&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Add support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALTER MATERIALIZED VIEW ... SET PROPERTIES&lt;/code&gt; in the engine.&lt;/li&gt;
  &lt;li&gt;Prevent hanging query execution on failures with phased execution policy.&lt;/li&gt;
  &lt;li&gt;Support for renaming schemas in PostgreSQL and Redshift connectors.&lt;/li&gt;
  &lt;li&gt;Lots of improvements to the ClickHouse connector, thanks Yuya!&lt;/li&gt;
  &lt;li&gt;Update to newer ClickHouse version removed support for Altinity 20.3.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;$properties&lt;/code&gt; table and other hidden tables in Iceberg connector, including
docs.&lt;/li&gt;
  &lt;li&gt;Automatically adjust &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ulimit&lt;/code&gt; setting when using the RPM package.&lt;/li&gt;
  &lt;li&gt;Docker images changes to UBI.&lt;/li&gt;
  &lt;li&gt;Remove support/need for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;allow-drop-table&lt;/code&gt; catalog property in JDBC connectors.&lt;/li&gt;
  &lt;li&gt;A bunch of SPI changes.&lt;/li&gt;
  &lt;li&gt;DML with Iceberg connector with fault tolerant mode and more Tardigrade improvements.&lt;/li&gt;
  &lt;li&gt;Drop support for Kudu 1.13.0.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the &lt;a href=&quot;https://trino.io/docs/current/release/release-369.html&quot;&gt;Trino
369&lt;/a&gt;, &lt;a href=&quot;https://trino.io/docs/current/release/release-370.html&quot;&gt;Trino
370&lt;/a&gt;, and &lt;a href=&quot;https://trino.io/docs/current/release/release-371.html&quot;&gt;Trino
371&lt;/a&gt; release notes.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-month-high-availability-with-trino&quot;&gt;Concept of the month: High availability with Trino&lt;/h2&gt;

&lt;p&gt;Goldman Sachs uses Trino to reduce last-mile ETL, and provide a unified way of 
accessing data through federated joins. Making a variety of data sets from 
different sources available in one spot for our data science team was a tall 
order. Data must be quickly accessible to data consumers, and systems like Trino
must be reliable for users to trust this singular access point for their data.&lt;/p&gt;

&lt;p&gt;In order for analysts and data scientists to use these services, they first need
to trust in the system. It was vital to Goldman Sachs that Trino has high 
availability. In the event of any failure, another Trino cluster is available to
process requests.&lt;/p&gt;

&lt;h3 id=&quot;integrating-trino-into-the-goldman-sachs-internal-ecosystem&quot;&gt;Integrating Trino into the Goldman Sachs internal ecosystem&lt;/h3&gt;

&lt;p&gt;Before high availability was a concern, the team had to first integrate Trino to
meet their requirements. This included integrating with internal security 
systems, observability systems, and credential stores. It also meant
adding integration with their governance services that manage cataloguing
services and data discovery engines. Finally, while many of the Trino connectors
that the team intended to use exist, there were many missing features and 
performance enhancements that would lead to a better user experience and more 
adoption. The team has since taken it upon themselves to work on these features
and contribute them back to Trino. We will cover some of these contributions in
the PR segment of this show.&lt;/p&gt;

&lt;h3 id=&quot;achieving-scaling-and-high-availability&quot;&gt;Achieving scaling and high availability&lt;/h3&gt;

&lt;p&gt;Once the team had much of Trino running for some initial use cases, the next 
step was to improve support for more simultaneous use cases and highly 
concurrent workloads. The team wanted trust in the system, so as they scaled,
the ability to run blue-green deployments, enable resource isolation, and
keep clusters highly available through failures became much more pertinent.&lt;/p&gt;

&lt;h3 id=&quot;trino-ecosystem-at-goldman-sachs&quot;&gt;Trino ecosystem at Goldman Sachs&lt;/h3&gt;

&lt;p&gt;Here is an overview of the Goldman Sachs ecosystem. It showcases the preexisting
services that needed to connect to Trino, the catalogs supported, and the method
in which Goldman Sachs achieves high availability through supporting multiple
clusters in various groups.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/33/trinoecosystem.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://developer.gs.com/blog/posts/enabling-highly-available-trino-clusters-at-goldman-sachs&quot;&gt;Goldman Sachs Blog&lt;/a&gt;
&lt;/p&gt;

&lt;h3 id=&quot;dynamic-query-routing&quot;&gt;Dynamic query routing&lt;/h3&gt;

&lt;p&gt;In order to ensure that all the clusters receive an even distribution of
traffic, the team created services that enable dynamic query routing across the
different cluster groups.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/33/trinodynamicqueryrouting.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://developer.gs.com/blog/posts/enabling-highly-available-trino-clusters-at-goldman-sachs&quot;&gt;Goldman Sachs Blog&lt;/a&gt;
&lt;/p&gt;

&lt;h3 id=&quot;query-routing-components&quot;&gt;Query routing components&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.envoyproxy.io/&quot;&gt;Envoy Proxy&lt;/a&gt; - open source edge and service proxy
that provides features such as routing, traffic management, load balancing, 
external authorization, rate limiting, and more.&lt;/li&gt;
&lt;/ul&gt;
&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/33/trinocontrolplane.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://developer.gs.com/blog/posts/enabling-highly-available-trino-clusters-at-goldman-sachs&quot;&gt;Goldman Sachs Blog&lt;/a&gt;
&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Cluster Groups - a cluster group is a set of Trino clusters that can
be assigned traffic by the router service.&lt;/li&gt;
  &lt;li&gt;Cluster Metadata Service - a service that provides the Envoy routers with all
the cluster-related configurations.&lt;/li&gt;
  &lt;li&gt;Router Service
    &lt;ul&gt;
      &lt;li&gt;Envoy Control Plane - The Envoy Control Plane is an xDS gRPC-based service
that is responsible for providing dynamic configurations to Envoy.&lt;/li&gt;
      &lt;li&gt;Upstream Cluster Selection - Envoy provides HTTP filters to parse and modify
both request and response headers. We use a custom Lua filter to parse the 
request and extract the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;x-trino-user&lt;/code&gt; header. Then, we call the router 
service, which returns the upstream cluster address.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;
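&lt;p&gt;As a rough sketch of the upstream cluster selection step, an Envoy Lua filter
along these lines can extract the header and record the routing decision. The
cluster name, path, and response header used here are illustrative assumptions,
not the actual Goldman Sachs implementation:&lt;/p&gt;

&lt;div class=&quot;language-lua highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-- Illustrative sketch: read the Trino user and ask a router service
-- (registered as the &quot;router_service&quot; Envoy cluster) for the upstream.
function envoy_on_request(request_handle)
  local user = request_handle:headers():get(&quot;x-trino-user&quot;) or &quot;&quot;
  local headers, body = request_handle:httpCall(
    &quot;router_service&quot;,
    {
      [&quot;:method&quot;] = &quot;GET&quot;,
      [&quot;:path&quot;] = &quot;/v1/cluster?user=&quot; .. user,
      [&quot;:authority&quot;] = &quot;router-service&quot;
    },
    &quot;&quot;, 5000)
  -- Stash the chosen upstream in a header for later routing stages.
  request_handle:headers():replace(&quot;x-trino-upstream&quot;, body)
end
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;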

&lt;h2 id=&quot;pr-of-the-month-pr-8956-add-support-for-external-db-for-schema-management-in-mongodb-connector&quot;&gt;PR of the month: PR 8956 Add support for external db for schema management in MongoDB connector&lt;/h2&gt;

&lt;p&gt;This month’s &lt;a href=&quot;https://github.com/trinodb/trino/pull/8956&quot;&gt;PR of the month&lt;/a&gt; comes
from today’s guest Siddhant to solve &lt;a href=&quot;https://github.com/trinodb/trino/issues/8887&quot;&gt;this issue related to the MongoDB connector&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Siddhant created the issue in response to the common problem that MongoDB 
connector users face when they don’t have write capability in the MongoDB system.
Since MongoDB has no implicit schema, Trino uses a schema definition that is
written to a special MongoDB database. This PR lets users without write access
configure an external location for storing their schema, avoiding the issue.&lt;/p&gt;

&lt;p&gt;Thanks, Siddhant, for raising this issue, as it’s one that beginners using
the MongoDB connector face frequently.&lt;/p&gt;

&lt;h2 id=&quot;bonus-pr-of-the-month-pr-8202-metadata-for-alias-in-elasticsearch-connector-only-uses-the-first-mapping&quot;&gt;Bonus PR of the month: PR 8202 Metadata for alias in Elasticsearch connector only uses the first mapping&lt;/h2&gt;

&lt;p&gt;This bonus &lt;a href=&quot;https://github.com/trinodb/trino/pull/8202&quot;&gt;PR of the month&lt;/a&gt; comes
from another one of today’s guests, Suman. It solves multiple issues, meaning 
this feature is in high demand!&lt;/p&gt;

&lt;p&gt;The problems brought up by these issues also have to do with how we map
schemas onto NoSQL databases that don’t implicitly have a schema. In this case
Elasticsearch stores its schema in an object called a mapping. This mapping can
be strict or dynamic for various portions of the documents that get inserted.
The object that correlates to a table in Elasticsearch is called an index. To
keep Elasticsearch fast, multiple indexes are created periodically to support a
given document type, similar to partitioning in a database. In general, these 
indexes follow a very common mapping for a given type, but the reality is that 
Elasticsearch allows you to vary from the mapping. Trino currently simplifies
the way this is done by only reading the first mapping and assuming that all
indexes and documents follow this schema. This pull request addresses the issue
by scanning a much larger sample of mappings and merging the schemas to handle
any conflicts. It then goes further to cache these merged mappings for a given
amount of time.&lt;/p&gt;

&lt;p&gt;Thanks for all of your continued work on this Suman! It will help a lot!&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-month-trino-fiddle-a-tool-for-easy-online-testing-and-sharing-of-trino-sql-problems-and-their-solutions&quot;&gt;Demo of the month: Trino Fiddle: A tool for easy online testing and sharing of Trino SQL problems and their solutions&lt;/h2&gt;

&lt;p&gt;This month’s demo showcases Trino Fiddle, a tool that Brian adapted from the
&lt;a href=&quot;http://sqlfiddle.com/&quot;&gt;SQL Fiddle&lt;/a&gt; tool. It allows Trino users to share problems
and answer questions that other Trino users are facing.&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-month-does-trino-support-carbondata&quot;&gt;Question of the month: Does Trino support CarbonData?&lt;/h2&gt;

&lt;p&gt;This month’s &lt;a href=&quot;https://www.trinoforum.org/t/142&quot;&gt;question of the month&lt;/a&gt; 
comes from &lt;a href=&quot;https://www.trinoforum.org/u/masayyed/summary&quot;&gt;Mahebub Sayyed&lt;/a&gt; on 
Trino Forum. Mahebub asks, “Does Trino support CarbonData?”&lt;/p&gt;

&lt;p&gt;The answer is a little tricky, but it can be done!&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://carbondata.apache.org/&quot;&gt;CarbonData&lt;/a&gt; currently maintains a connector 
called &lt;a href=&quot;https://mvnrepository.com/artifact/org.apache.carbondata/carbondata-presto&quot;&gt;carbondata-presto&lt;/a&gt; 
that works with an older version of Trino, version 333 (an io.prestosql version 
&lt;a href=&quot;https://trino.io/blog/2020/12/27/announcing-trino.html&quot;&gt;before the rename&lt;/a&gt;). 
Someone has already opened &lt;a href=&quot;https://github.com/apache/carbondata/pull/4198&quot;&gt;a PR to update this connector to a current Trino version&lt;/a&gt;, 
but the work dates from the middle of 2021 and hasn’t made much progress 
recently.&lt;/p&gt;

&lt;p&gt;That being said, you could build and use 
&lt;a href=&quot;https://github.com/czy006/carbondata/tree/trino-358-alpha/integration/trino&quot;&gt;the Trino version of the connector&lt;/a&gt; 
this person was working on, and see if it works for you. If you are running on a 
version of Trino that is older than 351, you should be able to use the existing 
carbondata-presto connector.&lt;/p&gt;

&lt;p&gt;If anyone feels motivated, it would be wonderful if you could help get this 
contributed to the CarbonData project, or even work with them to have it land
in the Trino project!&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs and resources&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://developer.gs.com/blog/posts/enabling-highly-available-trino-clusters-at-goldman-sachs&quot;&gt;Enabling Highly Available Trino Clusters at Goldman Sachs&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=-5mlZGjt6H4&quot;&gt;Video: Building a Federated Cost-Effective Highly Efficient Query Platform&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://developer.gs.com/blog/posts&quot;&gt;Goldman Sachs Developer Blog&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.goldmansachs.com/careers/&quot;&gt;Goldman Sachs Careers Page&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://twitter.com/gsdeveloper&quot;&gt;Follow @GSDeveloper on Twitter&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests</summary>

      
      
    </entry>
  
    <entry>
      <title>32: Trino Tardigrade: Try, try, and never die</title>
      <link href="https://trino.io/episodes/32.html" rel="alternate" type="text/html" title="32: Trino Tardigrade: Try, try, and never die" />
      <published>2022-01-20T00:00:00+00:00</published>
      <updated>2022-01-20T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/32</id>
      <content type="html" xml:base="https://trino.io/episodes/32.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Andrii Rosa, Software Engineer at &lt;a href=&quot;https://www.starburst.io&quot;&gt;Starburst&lt;/a&gt;
  (&lt;a href=&quot;https://www.linkedin.com/in/andrii-rosa-79578561/&quot;&gt;@andrii-rosa-79578561&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Brian Zhan, Product Manager at &lt;a href=&quot;https://www.starburst.io&quot;&gt;Starburst&lt;/a&gt;
  (&lt;a href=&quot;https://twitter.com/brianzhan1&quot;&gt;@brianzhan1&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Lukasz Osipiuk, Software Engineer at &lt;a href=&quot;https://www.starburst.io&quot;&gt;Starburst&lt;/a&gt;
  (&lt;a href=&quot;https://twitter.com/losipiuk&quot;&gt;@losipiuk&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Martin Traverso, Trino &amp;amp; Presto Co-founder and CTO at &lt;a href=&quot;https://www.starburst.io&quot;&gt;Starburst&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/mtraverso&quot;&gt;@mtraverso&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Zebing Lin, Software Engineer at &lt;a href=&quot;https://www.starburst.io&quot;&gt;Starburst&lt;/a&gt;
 (&lt;a href=&quot;https://www.linkedin.com/in/linzebing/&quot;&gt;@linzebing&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;trino-summit-2021&quot;&gt;Trino Summit 2021&lt;/h2&gt;

&lt;p&gt;If you missed &lt;a href=&quot;https://www.starburst.io/resources/trino-summit/&quot;&gt;Trino Summit 2021&lt;/a&gt;,
you can watch it on demand, for free!&lt;/p&gt;

&lt;h2 id=&quot;releases-367-and-368&quot;&gt;Releases 367 and 368&lt;/h2&gt;

&lt;p&gt;Martin’s official announcements merged into one:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Lineage tracking for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WITH&lt;/code&gt; clauses and subqueries.&lt;/li&gt;
  &lt;li&gt;Option to hide inaccessible columns in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SELECT *&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;flush_metadata_cache()&lt;/code&gt; procedure for the Hive connector.&lt;/li&gt;
  &lt;li&gt;Improve performance of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DECIMAL&lt;/code&gt; type.&lt;/li&gt;
  &lt;li&gt;File-based access control for the Iceberg connector.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TIME&lt;/code&gt; type in the SingleStore connector.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BINARY&lt;/code&gt; type in the Phoenix connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred’s additional notes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Prevent data loss on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DROP SCHEMA&lt;/code&gt; in Hive and Iceberg connectors.&lt;/li&gt;
  &lt;li&gt;New default query execution policy &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;phased&lt;/code&gt; brings performance improvements.&lt;/li&gt;
  &lt;li&gt;And finally, numerous smaller improvements around memory management and query
processing for our project Tardigrade.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the &lt;a href=&quot;https://trino.io/docs/current/release/release-367.html&quot;&gt;Trino
367&lt;/a&gt; and &lt;a href=&quot;https://trino.io/docs/current/release/release-368.html&quot;&gt;Trino
368&lt;/a&gt; release notes.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-month-introducing-project-tardigrade&quot;&gt;Concept of the month: Introducing Project Tardigrade&lt;/h2&gt;

&lt;p&gt;Before we jump right into the project, let’s cover some of the history of ETL and
data warehousing to better understand the problems that Tardigrade solves.&lt;/p&gt;

&lt;h3 id=&quot;why-do-people-want-to-do-etl-in-trino&quot;&gt;Why do people want to do ETL in Trino?&lt;/h3&gt;

&lt;p&gt;Trino is used for Extract, Transform, Load (ETL) workloads in many companies,
like Salesforce, Shopify, Slack, and older versions of Trino at Facebook.&lt;/p&gt;

&lt;p&gt;First, the most important thing is query speed. Queries run a lot faster in 
Trino. Open data stack technologies like Hive and Spark retry the query from 
intermediate checkpoints when something fails, but there’s a performance 
cost to this. Trino has always been focused on delivering query results as 
quickly as possible. Now, Trino performs task-level retries, enabling failure 
recovery where needed for longer-running queries. More on this later 
though.&lt;/p&gt;

&lt;p&gt;Second, most companies have widely dispersed and fragmented data. It’s typical
for companies to have different storage systems for different use cases.
This only becomes more commonplace when a merger or acquisition happens, and
you have a ton of data stored in yet another location. The acquiring company 
ends up with key information living in a bunch of different places. The net 
result is that a data engineer spends weeks writing that simple dashboard, and 
a data scientist trying to understand a trend gets impeded every time they draw
data from a new source, and eventually gives up.&lt;/p&gt;

&lt;p&gt;Third, data engineers want to spend their time writing business logic, not 
moving SQL between engines. Unfortunately, this is where they end up spending 
much of their time. Many do their ad-hoc analytics in Trino, because it provides
a far more interactive experience than any other engine. If they don’t just use
Trino, they have a 1,000-line SQL ETL job that they now need to convert into
another dialect. You just need to search “convert Spark Presto SQL Stack 
Overflow” to see the numerous challenges that people face moving between 
engines.&lt;/p&gt;

&lt;p&gt;Whether it’s optimizations in one engine not working in the other, a UDF in
Trino not existing in Spark, strange differences in the SQL dialect tripping 
people up, or debugging that is extremely difficult, these factors always delay 
completing their tasks. Data engineers are especially paranoid about 
converting SQL correctly. Imagine reporting an incorrect revenue metric 
externally, billing a user of your platform the incorrect amount, or delivering
the wrong content to users due to any of these issues.&lt;/p&gt;

&lt;h3 id=&quot;why-are-people-reluctant-to-do-their-etl-in-trino&quot;&gt;Why are people reluctant to do their ETL in Trino?&lt;/h3&gt;

&lt;p&gt;Before the drive for big data and technologies like Hadoop showed up on the 
scene, systems like Teradata, Netezza, and Oracle were used to run ETL pipelines
in a largely offline manner. If a query failed, you simply had to restart it. 
Vendors would brag about their systems’ low failure rates.&lt;/p&gt;

&lt;p&gt;As Big Data came to the forefront, systems like the &lt;a href=&quot;https://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf&quot;&gt;Google File System&lt;/a&gt;,
that largely inspired the design for the Hadoop Distributed File System, aimed 
to build large distributed systems that supported fault-tolerance. In essence,
faults were expected, and if a node in the system failed, no data would be lost.&lt;/p&gt;

&lt;p&gt;At this same time, compute and storage were becoming separate systems. 
Just as storage was built with fault-tolerance, compute systems like MapReduce
that processed and transformed data were also &lt;a href=&quot;https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf&quot;&gt;built with fault tolerance in mind&lt;/a&gt;.
Apache Hive is a syntax and metadata layer that enables generating MapReduce 
jobs without having to write code. Apache Spark came on the analytics scene
by &lt;a href=&quot;https://www.usenix.org/system/files/conference/nsdi12/nsdi12-final138.pdf&quot;&gt;introducing lineage&lt;/a&gt; 
as a way for engineers to have more control over how and when their datasets
are flushed to disk. This technique, while novel, still took the pessimistic
view that faults are a worst-case scenario to avoid.&lt;/p&gt;

&lt;p&gt;When Trino was created, it was designed with speed in mind. Trino creators 
Martin, Dain, and David chose not to add fault-tolerance to Trino, as they
recognized the tradeoff it imposes on fast analytics. Due to the nature of the 
streaming exchange in Trino, all tasks are interconnected, and a failure of any 
task results in a query failure. To support long-running queries, Trino has to 
be able to tolerate task failures.&lt;/p&gt;

&lt;p&gt;Having an all-or-nothing architecture makes it significantly more difficult to 
tolerate faults, regardless of how rare they are. The likelihood of a failure 
grows with the time it takes to complete a query. This risk also increases as 
the resource demands, such as memory requirements of a query, grow. It’s 
impossible to know the exact memory requirements for processing a query upfront.
In addition to increased likelihood of a failure, the impact of failing a long 
running query is much higher, as it often results in a significant waste of time
and resources.&lt;/p&gt;

&lt;p&gt;You may think all-or-nothing is a model destined to fail, especially when 
scaling to petabytes of data. On the contrary, Trino’s predecessor Presto was 
commonly used to execute batch workloads at this scale at Facebook. Even today,
companies like &lt;a href=&quot;https://medium.com/salesforce-engineering/how-to-etl-at-petabyte-scale-with-trino-5fe8ac134e36&quot;&gt;Salesforce&lt;/a&gt;, 
&lt;a href=&quot;https://www.starburst.io/resources/trino-summit/?wchannelid=2ug6mgs5ao&amp;amp;wmediaid=j1eq196a4y&quot;&gt;Doordash&lt;/a&gt;, 
and many others use Trino at petabyte scale to handle ETL workloads. While 
scaling Trino to run petabyte-scale ETL pipelines is possible, you really have 
to know what you’re doing.&lt;/p&gt;

&lt;p&gt;Resource management is another challenge. Users don’t know exactly what 
resource utilization to expect from a query they submit, which makes it 
challenging to properly size the cluster and avoid resource-related failures.&lt;/p&gt;

&lt;p&gt;In essence, most people avoid using Trino for ETL because they lack the 
understanding of how to correctly configure Trino at scale.&lt;/p&gt;

&lt;h3 id=&quot;what-are-the-limitations-of-the-current-architecture&quot;&gt;What are the limitations of the current architecture?&lt;/h3&gt;

&lt;p&gt;In the current architecture, Trino plans all tasks for processing a specific 
query upfront. These tasks interconnect with one another, as the results from
one task are the input for the next. This interdependency is necessary, but 
if any task fails along the way, the entire chain breaks.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/32/interconnected-tasks.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;Data is streamed through the task graph with no intermediate checkpointing. The 
only query execution state is the internal, volatile state of the operators 
running within tasks.&lt;/p&gt;

&lt;p&gt;As stated before, this architecture has advantages, most notably high throughput
and low latency. Yet it also implies some limitations. The most obvious one is 
that it does not allow for granular failure recovery. If one of the tasks dies, 
there is no way to restart processing from an intermediate state. The only 
option is to rerun the whole query from the very beginning.&lt;/p&gt;

&lt;p&gt;The other notable limitation is around memory consumption. With static task 
placement we have little control over resource utilization on nodes.&lt;/p&gt;

&lt;p&gt;Finally, the current architecture makes many decisions upfront during query
planning. The engine creates a query plan based on incomplete data using table 
statistics, or blindly if statistics are not available. After the coordinator 
creates the plan and query processing starts, there aren’t many ways to adapt, 
even though much more information is available at runtime. For example, we 
cannot change the number of tasks for a stage. If we observe data skew, we 
can’t move tasks away from the overworked node so that the affected tasks have 
more resources at hand. And we cannot change the plan for a subquery if we 
notice that a decision already made is not optimal.&lt;/p&gt;

&lt;h3 id=&quot;trino-engine-improvements-with-project-tardigrade&quot;&gt;Trino engine improvements with Project Tardigrade&lt;/h3&gt;

&lt;p&gt;Project Tardigrade aims to break the all-or-nothing execution barriers. It opens
many new opportunities around resource management, adaptive query optimization,
and failure recovery. We will use a technique called spooling that stores 
intermediate data in an efficient buffering layer at stage boundaries. The 
buffer stores intermediate results for the duration of a query or a stage, 
depending on the context. The project is named after the microscopic &lt;a href=&quot;https://en.wikipedia.org/wiki/Tardigrade&quot;&gt;Tardigrades&lt;/a&gt;
that are the world’s most indestructible creatures, akin to the resiliency we 
are adding to Trino.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/32/tardigrade-logo.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;Buffering intermediate results makes it possible to execute queries iteratively.
For example, the engine can process one or several tasks at a time, effectively 
reducing memory pressure and allowing memory-intensive queries to succeed without 
a need to expand the cluster. Tardigrade can significantly lower the cost of 
operation, specifically in situations where only a small number of queries 
requires more memory than is available.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/32/tardigrade-buffers.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h4 id=&quot;adaptive-planning&quot;&gt;Adaptive planning&lt;/h4&gt;

&lt;p&gt;The engine may also decide to re-optimize the query at stage boundaries. When
the engine buffers the intermediate data, it is possible to get better insight
into the nature of the data as it’s processed and adapt query plans accordingly.
For example, when the cost-based optimizer makes a bad decision because of 
incorrect statistics or estimates, it can pick the wrong type of join, or a 
suboptimal join order. The engine can then suspend the query, re-optimize the 
plan, and resume processing. Additionally, it may allow the engine to discover 
skewed datasets, and change query plans accordingly. This may significantly 
improve efficiency and landing time for workloads that are &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JOIN&lt;/code&gt; heavy.&lt;/p&gt;

&lt;h4 id=&quot;resource-management&quot;&gt;Resource management&lt;/h4&gt;

&lt;p&gt;Iterative query processing allows us to be more flexible in resource management.
Resource allocation can be adjusted as the queries run. For example, when a 
cluster is idle, we may allow a single query to utilize all available resources
on the cluster. When more workload kicks in, the resource allocation for the 
initial query can be gradually reduced, and available resources can be granted
to newly submitted workloads. With this model it is also significantly easier to
implement auto scaling. When the submitted workload requires more resources than
currently available in the cluster, the engine can request more nodes. Conversely, 
if the cluster is underutilized, it is easier to return resources, since there 
is no need to wait for slow-running tasks. Being able to better manage 
available resources, and adjust the resource pool based on the current workload 
submitted, would make the engine significantly more cost effective.&lt;/p&gt;

&lt;h4 id=&quot;fine-grained-failure-recovery&quot;&gt;Fine-grained failure recovery&lt;/h4&gt;

&lt;p&gt;Last, but not least, with Project Tardigrade we are going to provide 
fine-grained failure recovery. The buffering introduced at stage boundaries 
allows for a transparent restart of failed tasks. Fine-grained failure recovery
would make completion times for ETL pipelines significantly more predictable. 
It also opens up the opportunity to run ETL workloads on much cheaper, widely 
available spot instances, further optimizing operational costs.&lt;/p&gt;
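&lt;p&gt;As a rough sketch of what this looks like for operators: fault-tolerant 
execution in current Trino releases is enabled through configuration properties 
along these lines, with the exchange manager providing the spooling buffer. 
Treat the values, and in particular the bucket name, as placeholders:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# etc/config.properties
retry-policy=TASK

# etc/exchange-manager.properties
exchange-manager.name=filesystem
exchange.base-directories=s3://example-exchange-spooling
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;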

&lt;h3 id=&quot;opportunities-that-tardigrade-opens&quot;&gt;Opportunities that Tardigrade opens&lt;/h3&gt;

&lt;p&gt;In summary, in Project Tardigrade we are working on the following improvements to Trino:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Predictable query completion times.&lt;/li&gt;
  &lt;li&gt;The ability to scale up or down to match the workload at runtime.&lt;/li&gt;
  &lt;li&gt;Fine-grained resource management.&lt;/li&gt;
  &lt;li&gt;Support for non-homogenous hardware.&lt;/li&gt;
  &lt;li&gt;Adaptive resource limits for tasks.&lt;/li&gt;
  &lt;li&gt;Graceful shutdown improvements.&lt;/li&gt;
  &lt;li&gt;Cheaper compute costs using spot instances with lower availability guarantees.&lt;/li&gt;
  &lt;li&gt;Adaptive query replanning at runtime as context changes.&lt;/li&gt;
  &lt;li&gt;Handling of situations where certain tasks are affected by data skew.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;efficient-exchange-data-buffering-implementation&quot;&gt;Efficient exchange data buffering implementation&lt;/h3&gt;

&lt;p&gt;This all sounds incredible, but it raises the question: how do we best implement
these buffers? Enabling task-level retries requires us to store intermediate 
exchange data in a “distributed buffer”. Minimizing the disturbance that 
buffering causes to query performance requires careful design consideration.&lt;/p&gt;

&lt;p&gt;A naive implementation is to use cloud object storage as intermediate storage.
This allows you to scale without maintaining a separate service, and it is the 
initial option we are using as a prototype buffer. It is intended as a 
proof-of-concept and should be good enough for small clusters of ten to twenty
nodes. However, this option can be slow and won’t support high-cardinality 
exchanges, because the number of files grows quadratically with the number of 
partitions. Trino then has to keep track of the metadata of all these files in 
order to plan and schedule which tasks require which files for the query. With 
a high number of files, there is a memory cost to holding that metadata, as 
well as a penalty for the time and network bandwidth it takes to list them all. 
This is the well-known many-small-files problem in big data.&lt;/p&gt;
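&lt;p&gt;To see where the quadratic growth comes from: in a hash exchange, every 
upstream task writes one file per output partition. A quick back-of-the-envelope 
sketch, with illustrative numbers only (this is not Trino code):&lt;/p&gt;

```python
# Illustration of the many-small-files problem with an object storage
# exchange buffer: each upstream task writes one file per output
# partition, so the file count is the product of the two.
def exchange_file_count(upstream_tasks, partitions):
    return upstream_tasks * partitions

# Scaling both sides of the exchange by 10x grows the file count by 100x.
print(exchange_file_count(10, 10))      # 100 files
print(exchange_file_count(100, 100))    # 10000 files
```

&lt;p&gt;Keeping metadata for tens of thousands of files, and listing them all over the 
network, is where the memory and bandwidth costs come from.&lt;/p&gt;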

&lt;h4 id=&quot;distributed-memory-with-spilling-as-a-buffer&quot;&gt;Distributed memory with spilling as a buffer&lt;/h4&gt;

&lt;p&gt;This solution requires a long-running managed service, but improves performance.
Depending on the design we choose, we can use write-ahead buffers to output data 
belonging to the same partition and provide sequential I/O to downstream tasks.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;70%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/32/buffer-implementation.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-month-task-retries-with-project-tardigrade&quot;&gt;Demo of the month: Task retries with Project Tardigrade&lt;/h2&gt;

&lt;p&gt;In this month’s demo, Zebing showcases task retries using Project Tardigrade 
after throwing his EC2 instance out the window! See what happens next…&lt;/p&gt;

&lt;div class=&quot;youtube-video-container&quot;&gt;
  &lt;iframe width=&quot;702&quot; height=&quot;405&quot; src=&quot;https://www.youtube.com/embed/Tnd-QsDCd2Q&quot; title=&quot;YouTube video player&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;h2 id=&quot;pr-of-the-month-pr-10319-trino-lineage-fails-for-aliasedrelation&quot;&gt;PR of the month: PR 10319 Trino lineage fails for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AliasedRelation&lt;/code&gt;&lt;/h2&gt;

&lt;p&gt;This month’s &lt;a href=&quot;https://github.com/trinodb/trino/pull/10319&quot;&gt;PR of the month&lt;/a&gt; was
created to resolve &lt;a href=&quot;https://github.com/trinodb/trino/issues/10272&quot;&gt;an issue&lt;/a&gt; 
reported by Lyft Data Infrastructure Engineer, Arup Malakar (&lt;a href=&quot;https://github.com/amalakar&quot;&gt;@amalakar&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Arup reported that Trino lineage fails to capture upstream columns when a join 
and a transformation are used. More generally, the issue applied to any column 
used with a function whose arguments come from an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AliasedRelation&lt;/code&gt;. Starburst
engineer, Praveen Krishna (&lt;a href=&quot;https://github.com/Praveen2112&quot;&gt;@Praveen2112&lt;/a&gt;), 
resolved the issue two days later and, with the help of Arup and the Lyft team,
tested that the fix works!&lt;/p&gt;

&lt;p&gt;Thanks to both Arup and Praveen for the fix!&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-month-how-do-you-cast-json-to-varchar-with-trino&quot;&gt;Question of the month: How do you cast JSON to varchar with Trino?&lt;/h2&gt;

&lt;p&gt;This month’s &lt;a href=&quot;https://stackoverflow.com/questions/70701325&quot;&gt;question of the month&lt;/a&gt; 
comes from &lt;a href=&quot;https://stackoverflow.com/users/10924136&quot;&gt;Borislav Blagoev&lt;/a&gt; on Stack
Overflow. He asks, “How do you cast JSON to varchar with Trino?”&lt;/p&gt;

&lt;p&gt;This was answered by &lt;a href=&quot;https://stackoverflow.com/users/2501279&quot;&gt;Guru Stron&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
  &lt;p&gt;Use &lt;a href=&quot;https://trino.io/docs/current/functions/json.html#json_format&quot;&gt;json_format&lt;/a&gt;/
&lt;a href=&quot;https://trino.io/docs/current/functions/json.html#json_parse&quot;&gt;json_parse&lt;/a&gt; to handle json object conversions instead of casting:&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;select json_parse(&apos;{&quot;property&quot;: 1}&apos;) objstring_to_json, json_format(json &apos;{&quot;property&quot;: 2}&apos;) jsonobj_to_string
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Output:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;objstring_to_json&lt;/th&gt;
      &lt;th&gt;jsonobj_to_string&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;{&quot;property&quot;:1}&lt;/td&gt;
      &lt;td&gt;{&quot;property&quot;:2}&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs and resources&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/salesforce-engineering/how-to-etl-at-petabyte-scale-with-trino-5fe8ac134e36&quot;&gt;How to ETL at Petabyte-Scale with Trino&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests</summary>

      
      
    </entry>
  
    <entry>
      <title>31: Trinites II: Trino on AWS Kubernetes Service</title>
      <link href="https://trino.io/episodes/31.html" rel="alternate" type="text/html" title="31: Trinites II: Trino on AWS Kubernetes Service" />
      <published>2021-12-16T00:00:00+00:00</published>
      <updated>2021-12-16T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/31</id>
      <content type="html" xml:base="https://trino.io/episodes/31.html">&lt;h2 id=&quot;trino-summit-2021&quot;&gt;Trino Summit 2021&lt;/h2&gt;

&lt;p&gt;If you missed &lt;a href=&quot;https://www.starburst.io/resources/trino-summit/&quot;&gt;Trino Summit 2021&lt;/a&gt;,
you can watch it on demand, for free!&lt;/p&gt;

&lt;h2 id=&quot;releases-365-and-366&quot;&gt;Releases 365 and 366&lt;/h2&gt;

&lt;p&gt;Martin’s official announcement mentioned the following highlights:&lt;/p&gt;

&lt;p&gt;Trino 365&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Aggregations in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MATCH_RECOGNIZE&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TRUNCATE TABLE&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Compatibility with Pinot 0.8.0&lt;/li&gt;
  &lt;li&gt;HTTP proxy support for OAuth2 authentication&lt;/li&gt;
  &lt;li&gt;Many improvements to Iceberg connector&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Release notes: &lt;a href=&quot;https://trino.io/docs/current/release/release-365.html&quot;&gt;https://trino.io/docs/current/release/release-365.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Trino 366&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for automatic query retries&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DENY&lt;/code&gt; security rules&lt;/li&gt;
  &lt;li&gt;Performance optimizations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Release notes: &lt;a href=&quot;https://trino.io/docs/current/release/release-366.html&quot;&gt;https://trino.io/docs/current/release/release-366.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Manfred’s additional notes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Cool new SQL like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TRUNCATE TABLE&lt;/code&gt; and support for time travel&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;contains&lt;/code&gt; function for IP check in CIDR&lt;/li&gt;
  &lt;li&gt;Lots of performance and correctness fixes on Hive and Iceberg connectors&lt;/li&gt;
  &lt;li&gt;Drop support for old Pinot versions&lt;/li&gt;
  &lt;li&gt;Support for Hive to Iceberg redirects&lt;/li&gt;
  &lt;li&gt;Automatic TLS for internal communication&lt;/li&gt;
  &lt;li&gt;Support for Java 17&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And a last note, full Java 17 support is becoming a reality.&lt;/p&gt;

&lt;p&gt;More detailed information is available in the &lt;a href=&quot;https://trino.io/docs/current/release/release-365.html&quot;&gt;365&lt;/a&gt;
and &lt;a href=&quot;https://trino.io/docs/current/release/release-366.html&quot;&gt;366&lt;/a&gt; release notes.&lt;/p&gt;

&lt;p&gt;To play around with query retries, you need to set the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;retry_policy&lt;/code&gt; session
property to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;QUERY&lt;/code&gt; with the following command: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SET SESSION retry_policy=QUERY;&lt;/code&gt;&lt;/p&gt;

&lt;h2 id=&quot;log4shell&quot;&gt;Log4Shell&lt;/h2&gt;

&lt;p&gt;There’s a new vulnerability in town that has the potential to affect Java
projects that use some Log4j2 versions. It is called Log4Shell, and it does not
affect Trino. Read &lt;a href=&quot;https://trino.io/blog/2021/12/13/log4shell-does-not-affect-trino.html&quot;&gt;the blog for more details&lt;/a&gt;.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
 &lt;img align=&quot;center&quot; width=&quot;50%&quot; src=&quot;/assets/episode/31/log4shell.jpeg&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-month-replicasets-deployments-and-services&quot;&gt;Concept of the month: ReplicaSets, Deployments, and Services&lt;/h2&gt;

&lt;p&gt;In &lt;a href=&quot;/episodes/24.html&quot;&gt;the first installment of Trinetes&lt;/a&gt;, we talked about what 
containerization is and why we use it. We covered the difference between tools
like docker-compose and container orchestration systems like Kubernetes (k8s).
Finally, we went over the first k8s object called a &lt;a href=&quot;https://kubernetes.io/docs/concepts/workloads/pods/&quot;&gt;&lt;em&gt;pod&lt;/em&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;As a reminder, a pod is the basic unit of deployment in a k8s cluster. In this
episode, we cover how to scale, deploy, and connect these pods. If you are 
missing some context, you should review &lt;a href=&quot;/episodes/24.html&quot;&gt;the first installment of this series&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;replicasets&quot;&gt;ReplicaSets&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Replicas&lt;/em&gt; are one or more instances created from the same pod definition. In k8s,
the object used to manage replication is a &lt;a href=&quot;https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/&quot;&gt;&lt;em&gt;ReplicaSet&lt;/em&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;ReplicaSets provide high availability by managing multiple instances based on a 
pod definition in the k8s cluster. Kubernetes automatically brings up replacements
for any pod instances that go down in a ReplicaSet, maintaining the number of 
replicas you specify in the definition.&lt;/p&gt;
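&lt;p&gt;As a sketch, a minimal ReplicaSet definition looks like the following. The 
names and labels here are hypothetical and not part of the Trino Helm chart:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: trino-worker
spec:
  replicas: 2            # number of pod instances to keep running
  selector:
    matchLabels:
      app: trino         # pods matching this label are managed
  template:              # the pod definition to replicate
    metadata:
      labels:
        app: trino
    spec:
      containers:
        - name: trino-worker
          image: trinodb/trino:latest
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;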

&lt;p&gt;Replication also enables load balancing IO traffic over multiple pods. You gain 
the flexibility to scale up or down as traffic increases or decreases without 
any downtime.&lt;/p&gt;

&lt;p&gt;To scale the number of pods in a live ReplicaSet, you can update the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;replicas&lt;/code&gt; 
value in the ReplicaSet definition file, then run the following command to
apply the change:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl replace -f replicaset-definition.yml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can also edit the live ReplicaSet without changing the local file:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl edit replicaset &amp;lt;replicaset-name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
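
&lt;p&gt;Alternatively, you can scale imperatively without touching any definition file
at all:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl scale replicaset &amp;lt;replicaset-name&amp;gt; --replicas=4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;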

&lt;h3 id=&quot;labels-and-selectors&quot;&gt;Labels and selectors&lt;/h3&gt;

&lt;p&gt;Kubernetes objects have &lt;a href=&quot;https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/&quot;&gt;labels&lt;/a&gt;, 
which are just key/value properties used to identify and dynamically group k8s
objects. Labels should be meaningful and relevant to k8s users, making it easy 
to comprehend things like which application, version, component, and environment 
certain objects belong to. Labels are shared across instances, and so they are 
not unique.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors&quot;&gt;Selectors&lt;/a&gt;
specify a grouping of labels used to target a set of objects when deploying or 
applying other operations to them. For example, a ReplicaSet identifies the set 
of pods it manages with its selector. When you create the ReplicaSet, k8s 
creates the pods defined in the ReplicaSet’s template and matches them with its 
selector. If the pods crash, k8s brings up new pods and associates the new
pods with the ReplicaSet.&lt;/p&gt;
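
&lt;p&gt;Using the labels from the Trino Helm chart shown later in this episode, you
can select just the worker pods:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get pods -l app=trino,release=tcb,component=worker
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;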

&lt;h3 id=&quot;deployments&quot;&gt;Deployments&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://kubernetes.io/docs/concepts/workloads/controllers/deployment/&quot;&gt;Deployment&lt;/a&gt;
objects allow you to manage a ReplicaSet and perform actions on it, 
such as creation, rolling updates, rollbacks, pod updates, and so on.&lt;/p&gt;
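
&lt;p&gt;For example, once the worker Deployment from the Trino Helm chart is 
installed, standard &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;kubectl rollout&lt;/code&gt; commands let you watch, inspect, and
revert updates:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl rollout status deployment/tcb-trino-worker
kubectl rollout history deployment/tcb-trino-worker
kubectl rollout undo deployment/tcb-trino-worker
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;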

&lt;p align=&quot;center&quot;&gt;
 &lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/31/deployment.png&quot; /&gt;&lt;br /&gt;
 Source: https://www.udemy.com/course/learn-kubernetes/
&lt;/p&gt;

&lt;p&gt;The best way to start making sense of these concepts is to look at the k8s
configuration files. You can generate them from the Trino Helm chart with:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;helm template tcb trino/trino --version 0.3.0
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Below is the generated deployment configuration, 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trino/templates/deployment-worker.yaml&lt;/code&gt;, with comments that delineate what
the different sections of the configuration define.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#-------------------------Deployment-----------------------------
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tcb-trino-worker
  labels:
    app: trino
    chart: trino-0.3.0
    release: tcb
    heritage: Helm
    component: worker
spec:
#-------------------------ReplicaSet-----------------------------
  replicas: 2
  selector:
    matchLabels:
      app: trino
      release: tcb
      component: worker
  template:
#----------------------------Pod---------------------------------
    metadata:
      labels:
        app: trino
        release: tcb
        component: worker
    spec:
      volumes:
        - name: config-volume
          configMap:
            name: tcb-trino-worker
        - name: catalog-volume
          configMap:
            name: tcb-trino-catalog
      imagePullSecrets:
        - name: registry-credentials
      containers:
        - name: trino-worker
          image: &quot;trinodb/trino:latest&quot;
          imagePullPolicy: IfNotPresent
          env:
            []
          volumeMounts:
            - mountPath: /etc/trino
              name: config-volume
            - mountPath: /etc/trino/catalog
              name: catalog-volume
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /v1/info
              port: http
          readinessProbe:
            httpGet:
              path: /v1/info
              port: http
          resources:
            {}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;configmap&quot;&gt;ConfigMap&lt;/h3&gt;

&lt;p&gt;You may have noticed that the pods define volumes that refer to an
object called a &lt;a href=&quot;https://kubernetes.io/docs/concepts/configuration/configmap/&quot;&gt;&lt;em&gt;ConfigMap&lt;/em&gt;&lt;/a&gt;.
This is a way to store non-confidential data in the form of key-value pairs.&lt;/p&gt;
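
&lt;p&gt;Once the chart is installed, you can inspect the contents of a ConfigMap in
the running cluster:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get configmap tcb-trino-worker -o yaml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;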

&lt;p&gt;ConfigMaps are how the Trino chart loads the &lt;a href=&quot;https://trino.io/docs/current/installation/deployment.html#configuring-trino&quot;&gt;Trino configurations&lt;/a&gt; 
into the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/trino&lt;/code&gt; directory on the containers. The ConfigMap file, 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trino/templates/configmap-worker.yaml&lt;/code&gt;, defines the files loaded into the 
worker nodes. The only real difference between the worker and coordinator 
ConfigMaps is in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;config.properties&lt;/code&gt; file, which specifies whether the node is a coordinator or not.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apiVersion: v1
kind: ConfigMap
metadata:
  name: tcb-trino-worker
  labels:
    app: trino
    chart: trino-0.3.0
    release: tcb
    heritage: Helm
    component: worker
data:
  node.properties: |
    node.environment=production
    node.data-dir=/data/trino
    plugin.dir=/usr/lib/trino/plugin

  jvm.config: |
    -server
    -Xmx8G
    -XX:+UseG1GC
    -XX:G1HeapRegionSize=32M
    -XX:+UseGCOverheadLimit
    -XX:+ExplicitGCInvokesConcurrent
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:+ExitOnOutOfMemoryError
    -Djdk.attach.allowAttachSelf=true
    -XX:-UseBiasedLocking
    -XX:ReservedCodeCacheSize=512M
    -XX:PerMethodRecompilationCutoff=10000
    -XX:PerBytecodeRecompilationCutoff=10000
    -Djdk.nio.maxCachedBufferSize=2000000

  config.properties: |
    coordinator=false
    http-server.http.port=8080
    query.max-memory=4GB
    query.max-memory-per-node=1GB
    query.max-total-memory-per-node=2GB
    memory.heap-headroom-per-node=1GB
    discovery.uri=http://tcb-trino:8080

  log.properties: |
    io.trino=INFO
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The only other ConfigMap defines the &lt;a href=&quot;https://trino.io/docs/current/installation/deployment.html#catalog-properties&quot;&gt;catalog properties files&lt;/a&gt;
in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/etc/trino/catalog&lt;/code&gt; folder. This ConfigMap only defines two catalogs.
They expose the TPC-H and TPC-DS benchmark datasets.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apiVersion: v1
kind: ConfigMap
metadata:
  name: tcb-trino-catalog
  labels:
    app: trino
    chart: trino-0.3.0
    release: tcb
    heritage: Helm
    role: catalogs
data:
  tpch.properties: |
    connector.name=tpch
    tpch.splits-per-node=4
  tpcds.properties: |
    connector.name=tpcds
    tpcds.splits-per-node=4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;networking&quot;&gt;Networking&lt;/h3&gt;

&lt;p&gt;Unlike in the Docker world, where containers run directly on the host and can 
be exposed there, pods in a k8s cluster run in a private network. 
Kubernetes exposes the internal IP address of the pod through the IP address of 
the k8s node and a unique port.&lt;/p&gt;

&lt;p&gt;While these IP addresses can be used to address pods internally, it’s not a 
good idea, as they are dynamic and subject to change upon termination and 
recreation. Instead, you set up routing that handles addressing by pod name 
rather than IP address.&lt;/p&gt;

&lt;p&gt;When you have multiple k8s nodes, you have multiple IP addresses set up for
the nodes. The routing software must be set up to assign the internal networks 
to each node to avoid conflicts across the cluster. This type of functionality 
exists in cloud services, such as Amazon EKS, Google GKE, and 
Azure AKS.&lt;/p&gt;

&lt;h3 id=&quot;services&quot;&gt;Services&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://kubernetes.io/docs/concepts/services-networking/service/&quot;&gt;&lt;em&gt;Services&lt;/em&gt;&lt;/a&gt; 
establish connectivity between different pods and can make pods available 
from the external k8s node IP address. This enables loose coupling between 
microservices in applications.&lt;/p&gt;

&lt;p&gt;There are three service types.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;ClusterIP - the service creates a virtual IP inside the cluster to enable 
communication between different services. This service is the default when you
don’t specify a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;type&lt;/code&gt; value under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;spec&lt;/code&gt; in the configuration.&lt;/li&gt;
  &lt;li&gt;NodePort - is used to expose the internal address of a pod using the IP 
address and port of the node it is running on.&lt;/li&gt;
  &lt;li&gt;LoadBalancer - this service creates a load balancer for the application in 
supported cloud providers. We won’t cover this one, but this is used when 
we create our cluster in EKS using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;eksctl&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here’s a diagram of the ClusterIP networking between different ReplicaSets.&lt;/p&gt;
&lt;p align=&quot;center&quot;&gt;
 &lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/31/clusterip.png&quot; /&gt;&lt;br /&gt;
 Source: https://www.udemy.com/course/learn-kubernetes/
&lt;/p&gt;

&lt;p&gt;NodePorts establish connectivity to a specific ReplicaSet of pod instances. 
They do not provide a generically accessible IP address for services to 
communicate with one another.&lt;/p&gt;

&lt;p&gt;In our case, we need a stable address for the coordinator.
The Helm chart defines a ClusterIP service to accomplish this. Notice the
selector targets the Trino app, the release label, and only the coordinator 
component, which we know is one node.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: tcb-trino
  labels:
    app: trino
    chart: trino-0.3.0
    release: tcb
    heritage: Helm
spec:
  type: ClusterIP
  ports:
    - port: 8080
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app: trino
    release: tcb
    component: coordinator
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;nodeport&quot;&gt;NodePort&lt;/h3&gt;

&lt;p&gt;The &lt;a href=&quot;https://kubernetes.io/docs/concepts/services-networking/service/#type-nodeport&quot;&gt;&lt;em&gt;NodePort&lt;/em&gt;&lt;/a&gt; 
service type creates a proxy service that forwards traffic from a specific port 
on the node to the pod.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
 &lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/31/service.png&quot; /&gt;&lt;br /&gt;
 Source: https://www.udemy.com/course/learn-kubernetes/
&lt;/p&gt;

&lt;p&gt;There are three ports when setting up a NodePort.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;TargetPort - the port number on the pod itself, to which the service 
forwards traffic.&lt;/li&gt;
  &lt;li&gt;Port - the port used by the service itself.&lt;/li&gt;
  &lt;li&gt;NodePort - the port that is exposed by the worker node and made available 
externally. NodePorts can only be in the range of 30000 - 32767.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The only required port to set is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;port&lt;/code&gt;. By default &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;targetPort&lt;/code&gt; is the 
same as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;port&lt;/code&gt; and nodePort is automatically assigned a free port in the 
allowed range. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ports&lt;/code&gt; is also an array which is why the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-&lt;/code&gt; char is used.&lt;/p&gt;
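
&lt;p&gt;A NodePort service for the coordinator is not part of the Helm chart, but a 
sketch of one, reusing the chart’s labels with a hypothetical service name, 
looks like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: trino-external   # hypothetical, not defined by the chart
spec:
  type: NodePort
  ports:
    - port: 8080         # port used by the service
      targetPort: 8080   # port on the pod
      nodePort: 30080    # exposed on every node, must be 30000-32767
  selector:
    app: trino
    release: tcb
    component: coordinator
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;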

&lt;h3 id=&quot;amazon-eks-elastic-kubernetes-service&quot;&gt;Amazon EKS (Elastic Kubernetes Service)&lt;/h3&gt;

&lt;p&gt;Amazon EKS is a managed container service to run and scale Kubernetes 
applications in the cloud. EKS provides k8s clusters in the cloud without you 
having to manage the whole k8s platform yourself. Unlike with
your own k8s cluster, you can’t log into the control plane node in EKS, although
you won’t need to. You are able to access the workers, which are usually EC2 instances.&lt;/p&gt;

&lt;p&gt;There are &lt;a href=&quot;https://docs.aws.amazon.com/eks/latest/userguide/create-cluster.html&quot;&gt;many steps involved in setting up a Kubernetes cluster&lt;/a&gt; 
on EKS, unless you use a simple command line tool called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;eksctl&lt;/code&gt; that
provisions the cluster for you.&lt;/p&gt;

&lt;h3 id=&quot;eksctl&quot;&gt;eksctl&lt;/h3&gt;

&lt;p&gt;From the &lt;a href=&quot;https://eksctl.io/&quot;&gt;eksctl website&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;eksctl&lt;/code&gt; is a simple CLI tool for creating and managing clusters on EKS - 
Amazon’s managed Kubernetes service for EC2. It is written in Go, uses 
CloudFormation, was created by Weaveworks and it welcomes contributions from 
the community. Create a basic cluster in minutes with just one command.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;demo-of-the-month-deploy-trino-k8s-to-amazon-eks&quot;&gt;Demo of the month: Deploy Trino k8s to Amazon EKS&lt;/h2&gt;

&lt;p&gt;First, you’ll need to install the following tools if you haven’t done so already:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/weaveworks/eksctl&quot;&gt;eksctl&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://kubernetes.io/docs/tasks/tools/&quot;&gt;kubectl&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://helm.sh/docs/intro/install/&quot;&gt;helm&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then you need to add your IAM credentials to the 
&lt;a href=&quot;https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html#cli-configure-files-where&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;~/.aws/credentials&lt;/code&gt; file&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Check the latest k8s version that is available on EKS.
&lt;a href=&quot;https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html&quot;&gt;https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html&lt;/a&gt;&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;eksctl create cluster \
 --name tcb-cluster \
 --version 1.21 \
 --region us-east-1 \
 --nodegroup-name k8s-tcb-cluster \
 --node-type t2.large \
 --nodes 2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The command takes 10 to 15 minutes to complete. This is the first output you
see:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;2021-12-16 01:25:17 [ℹ]  eksctl version 0.76.0
2021-12-16 01:25:17 [ℹ]  using region us-east-1
2021-12-16 01:25:17 [ℹ]  setting availability zones to [us-east-1a us-east-1e]
2021-12-16 01:25:17 [ℹ]  subnets for us-east-1a - public:192.168.0.0/19 private:192.168.64.0/19
2021-12-16 01:25:17 [ℹ]  subnets for us-east-1e - public:192.168.32.0/19 private:192.168.96.0/19
2021-12-16 01:25:17 [ℹ]  nodegroup &quot;k8s-tcb-cluster&quot; will use &quot;&quot; [AmazonLinux2/1.21]
2021-12-16 01:25:17 [ℹ]  using Kubernetes version 1.21
2021-12-16 01:25:17 [ℹ]  creating EKS cluster &quot;tcb-cluster&quot; in &quot;us-east-1&quot; region with managed nodes
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After some time, you notice that two EC2 instances have come up. The final 
output of the tool should look like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;2021-12-16 02:00:17 [ℹ]  waiting for at least 2 node(s) to become ready in &quot;k8s-tcb-cluster&quot;
2021-12-16 02:00:17 [ℹ]  nodegroup &quot;k8s-tcb-cluster&quot; has 2 node(s)
2021-12-16 02:00:17 [ℹ]  node &quot;ip-192-168-2-123.ec2.internal&quot; is ready
2021-12-16 02:00:17 [ℹ]  node &quot;ip-192-168-55-167.ec2.internal&quot; is ready
2021-12-16 02:00:18 [ℹ]  kubectl command should work with &quot;~/.kube/config&quot;, try &apos;kubectl get nodes&apos;
2021-12-16 02:00:18 [✔]  EKS cluster &quot;tcb-cluster&quot; in &quot;us-east-1&quot; region is ready
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Take special note that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;eksctl&lt;/code&gt; overwrote your k8s configuration to point to 
the EKS cluster instead of a local cluster. To test that you can connect, run:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get nodes
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You should see two nodes running. From here, installing Trino is simple: all
you have to do is reuse the Helm chart that we used to deploy Trino locally.
With the exact same command, you now deploy to EKS, since the tool updated
your settings.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;helm install tcb trino/trino --version 0.3.0
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After you’ve installed the Helm chart, wait a minute or two for the Trino 
service to fully start and run:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get deployments
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You should see output showing that the coordinator and both workers are available.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
tcb-trino-coordinator   1/1     1            1           67s
tcb-trino-worker        2/2     2            2           67s
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To connect to the cluster, the Helm output gives pretty good instructions on how
to create a tunnel from the cluster to your local laptop.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Get the application URL by running these commands:
  export POD_NAME=$(kubectl get pods --namespace default -l &quot;app=trino,release=tcb,component=coordinator&quot; -o jsonpath=&quot;{.items[0].metadata.name}&quot;)
  echo &quot;Visit http://127.0.0.1:8080 to use your application&quot;
  kubectl port-forward $POD_NAME 8080:8080
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Run that, then go to &lt;a href=&quot;http://127.0.0.1:8080&quot;&gt;http://127.0.0.1:8080&lt;/a&gt;, and you should see the Trino UI.&lt;/p&gt;

&lt;p&gt;To clear out the Helm install, run:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl delete service --all
kubectl delete deployment --all
kubectl delete configmap --all
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To tear down the entire k8s cluster, run:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;eksctl delete cluster --name tcb-cluster --region us-east-1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;pr-of-the-month-pr-8921-support-truncate-table-statement&quot;&gt;PR of the month: PR 8921: Support TRUNCATE TABLE statement&lt;/h2&gt;

&lt;p&gt;This week’s &lt;a href=&quot;https://github.com/trinodb/trino/issues/8921&quot;&gt;PR of the month&lt;/a&gt;
implements &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TRUNCATE TABLE&lt;/code&gt;. This command is very similar to a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DELETE&lt;/code&gt; statement,
with the exception that it does not perform deletes on individual rows. This 
ends up being a much faster operation than &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DELETE&lt;/code&gt;, as it uses fewer system 
and logging resources.&lt;/p&gt;
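
&lt;p&gt;For example, to remove all rows from a hypothetical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;orders&lt;/code&gt; table:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;TRUNCATE TABLE orders;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;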

&lt;p&gt;Thanks to Yuya Ebihira for adding the support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TRUNCATE TABLE&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-month-how-do-i-run-systemsync_partition_metadata-with-different-catalogs&quot;&gt;Question of the month: How do I run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;system.sync_partition_metadata&lt;/code&gt; with different catalogs?&lt;/h2&gt;

&lt;p&gt;This week’s &lt;a href=&quot;https://trinodb.slack.com/archives/CFLB9AMBN/p1639094856214800&quot;&gt;question of the month&lt;/a&gt; 
comes from Yu on Slack. Yu asks:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Hi team, in the following system procedure, how can we specify the catalog name?
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;system.sync_partition_metadata(schema_name, table_name, mode, case_sensitive)&lt;/code&gt;
We are using multiple catalogs and we need to call this procedure against 
non-default catalog.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I answered this with a link back to our &lt;a href=&quot;/episodes/5.html&quot;&gt;fifth episode&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;You need to set the catalog either in the jdbc string as I do in the video, or
you need to set the session catalog variable,
&lt;a href=&quot;https://trino.io/docs/current/sql/set-session.html&quot;&gt;https://trino.io/docs/current/sql/set-session.html&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
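
&lt;p&gt;You can also qualify the procedure with the catalog name directly in the 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CALL&lt;/code&gt; statement. The catalog, schema, and table names here are hypothetical:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CALL hive.system.sync_partition_metadata('default', 'orders', 'FULL');
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;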

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs and resources&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://normanlimxk.com/2021/12/07/run-trino-presto-on-minikube-on-aws/&quot;&gt;Run Trino/Presto on Minikube on AWS&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/episodes/24.html&quot;&gt;Trinetes I: Trino on Kubernetes TCB episode&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://sbakiu.medium.com/diy-analytics-platform-66638cc6a92f&quot;&gt;DIY Analytics Platform&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=p6xDCz00TxU&quot;&gt;AWS EKS - Create Kubernetes cluster on Amazon EKS: the easy way&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Mega Man 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;

      

      <summary>Trino Summit 2021</summary>

      
      
    </entry>
  
    <entry>
      <title>30: Trino and dbt, a hot data mesh</title>
      <link href="https://trino.io/episodes/30.html" rel="alternate" type="text/html" title="30: Trino and dbt, a hot data mesh" />
      <published>2021-11-17T00:00:00+00:00</published>
      <updated>2021-11-17T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/30</id>
      <content type="html" xml:base="https://trino.io/episodes/30.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;José Cabeda, Data Engineer at &lt;a href=&quot;https://www.talkdesk.com&quot;&gt;Talkdesk&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/jecabeda&quot;&gt;@jecabeda&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Przemek Denkiewicz, Cloud Ecosystem Engineer at &lt;a href=&quot;https://www.starburst.io&quot;&gt;Starburst&lt;/a&gt;
  (&lt;a href=&quot;https://twitter.com/hovaesco&quot;&gt;@hovaesco&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;trino-summit-2021&quot;&gt;Trino Summit 2021&lt;/h2&gt;

&lt;p&gt;If you missed &lt;a href=&quot;https://www.starburst.io/resources/trino-summit/&quot;&gt;Trino Summit 2021&lt;/a&gt;,
you can watch it on demand, for free!&lt;/p&gt;

&lt;h2 id=&quot;release-364&quot;&gt;Release 364&lt;/h2&gt;

&lt;p&gt;Trino 364 shipped on the first of November, just after our last episode. 
Martin’s official announcement mentioned the following highlights:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for dynamic filtering in Iceberg connector&lt;/li&gt;
  &lt;li&gt;Performance improvements when querying small files&lt;/li&gt;
  &lt;li&gt;Procedure to merge small files in Hive tables&lt;/li&gt;
  &lt;li&gt;Support for Cassandra UUID type&lt;/li&gt;
  &lt;li&gt;Support for MemSQL datetime and timestamp types&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred’s additional notes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALTER MATERIALIZED VIEW ... RENAME TO&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;A whole bunch of performance improvements&lt;/li&gt;
  &lt;li&gt;Elasticsearch connector no longer fails with unsupported types&lt;/li&gt;
  &lt;li&gt;A lot of improvements on Hive and Iceberg connectors&lt;/li&gt;
  &lt;li&gt;Hive connector has optimize procedure now!&lt;/li&gt;
  &lt;li&gt;Parquet and avro fixes and improvements&lt;/li&gt;
  &lt;li&gt;Web UI performance improvement for long query texts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More detailed information is available in the &lt;a href=&quot;https://trino.io/docs/current/release/release-364.html&quot;&gt;release notes&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-week-trino-and-dbt-a-hot-data-mesh&quot;&gt;Concept of the week: Trino and dbt, a hot data mesh&lt;/h2&gt;

&lt;p&gt;Data mesh, the buzzword that follows data lakehouse, may feel rather irrelevant
for many. This is especially true for those that just want to move from a Hive 
and HDFS cluster, or from a cloud data warehouse, to storing data in object 
storage and querying it with Trino.&lt;/p&gt;

&lt;p&gt;While data mesh is certainly in the hype cycle phase, it’s actually not a new
idea and has very sound principles. Many companies have written their own 
software and created organizational policies that align with the strategies 
outlined by the data mesh principles. In essence, these principles aim to make
data management for analytics platforms decentralized. This means decentralizing
the infrastructure and data engineers managing it to different domains (or 
products) within a company.&lt;/p&gt;

&lt;p&gt;What’s really exciting about data mesh is that much of the technology today 
makes these theoretical principles more of a reality without having to invent 
your own services. The author of &lt;a href=&quot;https://martinfowler.com/articles/data-mesh-principles.html&quot;&gt;data mesh&lt;/a&gt;,
Zhamak Dehghani, lays out four principles that characterize a data mesh:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Domain-oriented, decentralized data ownership and architecture&lt;/li&gt;
  &lt;li&gt;Data as a product&lt;/li&gt;
  &lt;li&gt;Self-serve data infrastructure as a platform&lt;/li&gt;
  &lt;li&gt;Federated computational governance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s see what the engineers from Talkdesk are doing to implement their data 
mesh.&lt;/p&gt;

&lt;h3 id=&quot;talkdesk&quot;&gt;Talkdesk&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://www.talkdesk.com&quot;&gt;Talkdesk&lt;/a&gt; is a contact center as a service
company, created at a &lt;a href=&quot;https://www.twilio.com&quot;&gt;Twilio&lt;/a&gt; Hackathon in 2011,
that just hit a 10 billion dollar valuation. As a fast-growing startup, Talkdesk
evolves its product strategy at a fast pace and regularly deals with large data
sets to analyze.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/30/talkdesk-scale.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;The Talkdesk product is deployed in cloud infrastructure and provides all the 
infrastructure for operating a call center. Its architecture is heavily 
event-driven. Dealing with realtime events at scale is difficult and requires a 
reactive and flexible architecture.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/30/talkdesk-events.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;The early architecture for the analytics platform followed a traditional
approach using Spark and Fivetran to ingest data into Redshift. It had various
pipelines to update the data for downstream consumption.&lt;/p&gt;

&lt;p&gt;This centralized workflow made communication around data entity management much
simpler, as it all existed within the same team. However, scaling caused growing
backlogs, which delayed analysis and deployments. It also made it difficult to
handle differing use cases, such as realtime versus historical analysis.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/30/talkdesk-architecture.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;Analytics and transactional use cases are varied and overlapping. Live data
typically feeds into stateful databases that update as data arrives, and
analyzing data in motion requires a realtime database. Historical data keeps
multiple copies of different states over time, which enables trend analysis over
longer periods rather than just the present moment. One challenge Talkdesk faced
was building a robust architecture that supports analyzing live data as the
latest changes arrive in the OLTP databases, while also meeting all the
analytics use cases.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/30/olap-oltp.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;To enable analytics across the various use cases, Talkdesk integrated Trino into
their workflow to read data across both live and historic data and merge them.
Using Trino enabled reading from live data feeding into their stateful data 
stores, and reads across historic data stores to produce data in the form needed
to support Talkdesk products.&lt;/p&gt;
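&lt;p&gt;A query merging both sides could look like the following sketch. The catalog,
schema, and table names are hypothetical placeholders, not Talkdesk’s actual
setup:&lt;/p&gt;

```sql
-- Combine the latest live records with the historical archive
-- (catalogs, schemas, and tables are illustrative placeholders)
SELECT call_id, call_date, duration
FROM postgresql.live.calls
WHERE call_date = current_date
UNION ALL
SELECT call_id, call_date, duration
FROM hive.archive.calls
WHERE call_date &lt; current_date;
```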

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;90%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/30/talkdesk-architecture-2.0.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;Trino is also used to hide the complexity of the data platform, and allows 
merging data across multiple relational and object stores.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;60%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/30/talkdesk-architecture-2.0-external.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h3 id=&quot;why-dbt&quot;&gt;Why dbt?&lt;/h3&gt;

&lt;p&gt;In &lt;a href=&quot;/episodes/21.html&quot;&gt;episode 21&lt;/a&gt; we discussed using dbt and Trino in detail. As
we mentioned there:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;dbt is a transformation workflow tool that lets teams quickly and 
collaboratively deploy analytics code, following software engineering best 
practices like modularity, CI/CD, testing, and documentation. It enables 
anyone who knows SQL to build production-grade data pipelines.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You can achieve modular, repeatable, and testable units of processing by 
defining models and configurations for the data pipelines. For example:&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/30/dbt-definition.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;
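&lt;p&gt;For illustration, a dbt model is just a SQL file with Jinja templating; the
model and column names below are hypothetical:&lt;/p&gt;

```sql
-- models/orders_enriched.sql (hypothetical dbt model)
{{ config(materialized='table') }}

select
    o.order_id,
    o.total,
    c.region
from {{ ref('stg_orders') }} o
join {{ ref('stg_customers') }} c
    on o.customer_id = c.customer_id
```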

&lt;p&gt;Using the definitions above, Talkdesk engineers were able to consolidate all
these tasks into a much more simplified graph of operations.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/30/dbt-results.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h3 id=&quot;why-data-mesh&quot;&gt;Why data mesh?&lt;/h3&gt;

&lt;p&gt;While a lot of focus has gone into the technology aspects of data mesh, there is
also a lot to be said about the implications for the data team and the
socio-political policies that come with data mesh. Talkdesk also made structural
changes to their team to improve their data mesh strategy.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/30/talkdesk-data-team.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h3 id=&quot;how-data-mesh-affects-the-everyday-life-of-data-engineers&quot;&gt;How does data mesh affect the everyday life of data engineers?&lt;/h3&gt;

&lt;p&gt;There is a real fear that arises when management changes business 
policies. It can be hard to tell how these policies trickle down and affect
the engineer’s everyday work life. In general, engineers become more entrenched
in individual domains rather than trying to manage all domains under one 
architecture. Data engineers are distributed to product teams and specialize
in the domain’s data models. They also have specific knowledge of how to use
the self-service platform to integrate with other teams.&lt;/p&gt;

&lt;h3 id=&quot;comparing-microservices-based-applications-to-the-data-mesh&quot;&gt;Comparing microservices-based applications to the data mesh&lt;/h3&gt;

&lt;p&gt;When we think of a functional system for deploying and managing 
microservices-based applications, there are several features that we’ve come to
expect. It is very easy to compare the features of microservices-based 
applications to the features of a data mesh, as laid out in the &lt;a href=&quot;https://blog.starburst.io/data-mesh-a-software-engineers-perspective&quot;&gt;Data Mesh: A Software Engineer’s Perspective&lt;/a&gt;
blog post.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-partitioned-table-tests-and-fixed-pr-9757&quot;&gt;PR of the week: Partitioned table tests and fixed PR 9757&lt;/h2&gt;

&lt;p&gt;This week’s &lt;a href=&quot;https://github.com/trinodb/trino/pull/9757&quot;&gt;PR of the week&lt;/a&gt;
is for the Iceberg connector. Release 364 had quite a few improvements for 
Iceberg and handled small issues that could cause query failures in some
scenarios. This PR addressed a query failure when reading a partition on a 
UUID column.&lt;/p&gt;

&lt;p&gt;Thanks to Piotr Findeisen for fixing this and many other bugs, as well as
improving performance in the Iceberg connector!&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-week-whats-the-difference-between-location-and-external_location&quot;&gt;Question of the week: What’s the difference between &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;location&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;external_location&lt;/code&gt;?&lt;/h2&gt;

&lt;p&gt;This week’s &lt;a href=&quot;https://www.trinoforum.org/t/105&quot;&gt;question of the week&lt;/a&gt; comes from 
Aakash Nand on Slack and was ported to the Trino Forum. Aakash asks:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;When creating a Hive table in Trino, what is the difference between 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;external_location&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;location&lt;/code&gt;? If I have to create an external table I have
to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;external_location&lt;/code&gt;, right? What is the difference between these two?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This was answered by Arkadiusz Czajkowski:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Tables created with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;location&lt;/code&gt; are managed tables. You have full control over 
them from their creation to modification. Tables created with 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;external_location&lt;/code&gt; are tables created by third-party systems. We just access 
them, mostly for reads. I would encourage you to use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;location&lt;/code&gt; in your case.&lt;/p&gt;
&lt;/blockquote&gt;
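&lt;p&gt;As a hedged sketch, the difference shows up in the table properties at creation
time. The catalog, schema, and bucket names below are hypothetical:&lt;/p&gt;

```sql
-- Managed table: data lives under the schema location and Trino owns its
-- lifecycle, so DROP TABLE also removes the underlying files
CREATE TABLE hive.sales.orders (order_id bigint, total double);

-- External table: points at files written by another system; Trino mostly
-- reads them, and DROP TABLE leaves the files in place
CREATE TABLE hive.sales.orders_ext (order_id bigint, total double)
WITH (external_location = 's3://example-bucket/warehouse/orders');
```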

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs and resources&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/geekculture/trino-dbt-a-match-in-sql-heaven-1df2a3d12b5e&quot;&gt;Trino + dbt = a match made in SQL heaven? Blog&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/episodes/21.html&quot;&gt;Trino + dbt = a match made in SQL heaven? TCB episode&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://martinfowler.com/articles/data-mesh-principles.html&quot;&gt;Data Mesh Principles&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://blog.starburst.io/data-mesh-a-software-engineers-perspective&quot;&gt;Data Mesh: A Software Engineer’s Perspective&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests</summary>

      
      
    </entry>
  
    <entry>
      <title>29: What is Trino and the Hive connector</title>
      <link href="https://trino.io/episodes/29.html" rel="alternate" type="text/html" title="29: What is Trino and the Hive connector" />
      <published>2021-10-28T00:00:00+00:00</published>
      <updated>2021-10-28T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/29</id>
      <content type="html" xml:base="https://trino.io/episodes/29.html">&lt;h2 id=&quot;release-364&quot;&gt;Release 364&lt;/h2&gt;

&lt;p&gt;Release 364 is just around the corner. Here is Manfred’s release preview:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALTER MATERIALIZED VIEW ... RENAME TO&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;A whole bunch of performance improvements&lt;/li&gt;
  &lt;li&gt;Elasticsearch connector no longer fails if fields with unsupported types exist&lt;/li&gt;
  &lt;li&gt;Hive connector has optimize procedure now!&lt;/li&gt;
  &lt;li&gt;Parquet and Avro fixes and improvements&lt;/li&gt;
&lt;/ul&gt;
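&lt;p&gt;The new rename statement from the list above works like the following sketch;
the catalog and view names are hypothetical:&lt;/p&gt;

```sql
-- Rename a materialized view within its schema
ALTER MATERIALIZED VIEW iceberg.analytics.daily_totals
RENAME TO daily_order_totals;
```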

&lt;h2 id=&quot;concept-of-the-week-what-is-trino&quot;&gt;Concept of the week: What is Trino?&lt;/h2&gt;

&lt;p&gt;Trino is the project created by Martin Traverso, Dain Sundstrom, David Phillips,
and Eric Hwang in 2012 to replace the 300PB Hive data warehouse at Facebook. The
goal of Trino is to run fast ad-hoc analytics queries over big data file systems
like HDFS and object stores like S3.&lt;/p&gt;

&lt;p&gt;An initially unintended but now characteristic feature of Trino is its ability 
to execute federated queries over various distributed data sources. This
includes, but is not limited to: Accumulo, BigQuery, Apache Cassandra, 
ClickHouse, Druid, Elasticsearch, Google Sheets, Apache Iceberg, Apache Hive, 
JMX, Apache Kafka, Kinesis, Kudu, MongoDB, MySQL, Oracle, Apache Phoenix, 
Apache Pinot, PostgreSQL, Prometheus, Redis, Redshift, SingleStore (MemSQL), 
and Microsoft SQL Server.&lt;/p&gt;
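&lt;p&gt;Federation means a single SQL statement can join several of these systems. A
sketch with hypothetical catalog and table names:&lt;/p&gt;

```sql
-- Join a data lake table with an operational database in one query
SELECT o.order_id, c.name
FROM hive.sales.orders o
JOIN postgresql.crm.customers c
    ON o.customer_id = c.customer_id;
```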

&lt;p&gt;How does Trino query across everything from data lakes, SQL, and NoSQL databases
at unprecedented speeds? It helps to start by going over Trino’s architecture:&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/29/1-architecture.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://trino.io/blog/2021/04/21/the-definitive-guide.html&quot;&gt;Trino: The Definitive Guide&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;Trino consists of two types of nodes, &lt;em&gt;coordinator&lt;/em&gt; and &lt;em&gt;worker&lt;/em&gt; nodes. The 
coordinator plans and schedules the processing of SQL queries, which are 
submitted by users directly or through connected SQL reporting tools. The workers 
carry out most of the processing by reading the data from the source and
performing various operations within the task(s) they are assigned.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/29/2-SPI.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://trino.io/blog/2021/04/21/the-definitive-guide.html&quot;&gt;Trino: The Definitive Guide&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;Trino is able to query multiple data sources by exposing a common interface
called the SPI (Service Provider Interface), which enables the core engine to
treat the interactions with each data source the same. Each connector must then
implement the SPI, which includes exposing metadata, statistics, and data 
locations, and establishing one or more connections with an underlying data source.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/29/3-parser-planner.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://trino.io/blog/2021/04/21/the-definitive-guide.html&quot;&gt;Trino: The Definitive Guide&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;Many of these interfaces are used in the coordinator during the analysis and 
planning phases. The analyzer, for example, uses the metadata SPI to make sure
the table in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FROM&lt;/code&gt; clause actually exists in the data source.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/29/4-distributed-query-plan.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://trino.io/blog/2021/04/21/the-definitive-guide.html&quot;&gt;Trino: The Definitive Guide&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;Once a logical query plan is generated, the coordinator converts it to a
distributed query plan that maps actions into stages containing tasks to be
run on nodes. Stages model the sequence of events as a directed acyclic graph
(DAG).&lt;/p&gt;
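&lt;p&gt;You can inspect this distributed plan for any query with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXPLAIN&lt;/code&gt;; the
table name below is a hypothetical placeholder:&lt;/p&gt;

```sql
-- Show the stages and fragments of the distributed plan
EXPLAIN (TYPE DISTRIBUTED)
SELECT region, count(*)
FROM hive.sales.orders
GROUP BY region;
```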

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/29/5-task-management.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://trino.io/blog/2021/04/21/the-definitive-guide.html&quot;&gt;Trino: The Definitive Guide&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;The coordinator then schedules tasks over the worker nodes as efficiently as 
possible, depending on the physical layout and distribution of the data.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/29/6-splits.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://trino.io/blog/2021/04/21/the-definitive-guide.html&quot;&gt;Trino: The Definitive Guide&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;Data is split and distributed across the worker nodes to provide 
inter-node parallelism.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/29/7-parallelism-over-drivers.png&quot; /&gt;&lt;br /&gt;
Source: &lt;a href=&quot;https://trino.io/blog/2021/04/21/the-definitive-guide.html&quot;&gt;Trino: The Definitive Guide&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;Once this data arrives at a worker node, it is further divided and processed in 
parallel. Workers submit the processed data back to the coordinator. Finally, the 
coordinator provides the results of the query to the user.&lt;/p&gt;

&lt;h2 id=&quot;pr-8821-add-https-query-event-logger&quot;&gt;PR 8821 Add HTTP/S query event logger&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/trinodb/trino/pull/8821&quot;&gt;Pull request 8821&lt;/a&gt; enables Trino cluster 
owners to log query processing metadata by submitting it to an HTTP endpoint. 
This may be used for usage monitoring and alerting, but it might also be used to
extract analytics on cluster usage, such as table and column usage metrics.&lt;/p&gt;

&lt;p&gt;Query events are serialized to JSON and sent to the provided address over HTTP 
or over HTTPS. Configuration allows selecting which events should be included.&lt;/p&gt;
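&lt;p&gt;As a minimal sketch, enabling the listener takes an event listener properties
file on the coordinator. The endpoint URL here is a placeholder; check the
documentation linked below for the full set of properties:&lt;/p&gt;

```properties
# etc/event-listener.properties (sketch, endpoint is a placeholder)
event-listener.name=http
http-event-listener.connect-ingest-uri=https://example.com/trino-events
http-event-listener.log-completed=true
```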

&lt;p&gt;Thanks for the contribution &lt;a href=&quot;https://github.com/mosiac1&quot;&gt;mosiac1&lt;/a&gt; and others at
Bloomberg!&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/docs/current/admin/event-listeners-http.html&quot;&gt;Read the docs&lt;/a&gt; 
to learn more about this exciting feature!&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-week-does-the-hive-connector-depend-on-the-hive-runtime&quot;&gt;Question of the week: Does the Hive connector depend on the Hive runtime?&lt;/h2&gt;

&lt;p&gt;This week’s question covers a lot of the confusion around the &lt;a href=&quot;https://trino.io/docs/current/connector/hive.html&quot;&gt;Hive
connector&lt;/a&gt;. In short, the answer 
is that the Hive runtime is not required. There’s more information available in 
the &lt;a href=&quot;https://trino.io/blog/2020/10/20/intro-to-hive-connector.html&quot;&gt;Intro to the Hive Connector blog&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Videos&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=ZwaVZplVmVA&quot;&gt;An Overview of the Starburst Trino Query Optimizer (Karol Sobczak)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Release 364</summary>

      
      
    </entry>
  
    <entry>
      <title>28: Autoscaling streaming ingestion to Trino with Pravega</title>
      <link href="https://trino.io/episodes/28.html" rel="alternate" type="text/html" title="28: Autoscaling streaming ingestion to Trino with Pravega" />
      <published>2021-10-14T00:00:00+00:00</published>
      <updated>2021-10-14T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/28</id>
      <content type="html" xml:base="https://trino.io/episodes/28.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Derek Moore, Software Senior Principal Engineer at &lt;a href=&quot;https://www.delltechnologies.com/en-us/index.htm&quot;&gt;Dell EMC&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/derekm00r3&quot;&gt;@derekm00r3&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Andrew Robertson, Principal Software Engineer at &lt;a href=&quot;https://www.delltechnologies.com/en-us/index.htm&quot;&gt;Dell EMC&lt;/a&gt;
  (&lt;a href=&quot;https://www.linkedin.com/in/andrew-robertson-986b885/&quot;&gt;@andrew-robertson&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Karan Singh, Software Engineer 2 at &lt;a href=&quot;https://www.delltechnologies.com/en-us/index.htm&quot;&gt;Dell EMC&lt;/a&gt;
 (&lt;a href=&quot;https://www.linkedin.com/in/singhkaranrakesh/&quot;&gt;@singhkaranrakesh&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;trino-summit-2021&quot;&gt;Trino Summit 2021&lt;/h2&gt;

&lt;p&gt;Get ready for &lt;a href=&quot;https://www.starburst.io/info/trinosummit/&quot;&gt;Trino Summit&lt;/a&gt;, coming
October 21st and 22nd! This annual Trino community event is where we gather 
practitioners who deploy Trino at scale to share their experiences and best 
practices with the rest of the community. While the planning for this event was 
a bit chaotic due to the pandemic, we have made the final decision to host the 
event virtually for the safety of all the attendees. We look forward to seeing
you there, and can’t wait to share more information in the coming weeks!&lt;/p&gt;

&lt;h2 id=&quot;release-363&quot;&gt;Release 363&lt;/h2&gt;

&lt;p&gt;Official announcement items from Martin:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New HTTP event listener plugin&lt;/li&gt;
  &lt;li&gt;Insert overwrite for S3-backed tables&lt;/li&gt;
  &lt;li&gt;Support for Elasticsearch &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scaled_float&lt;/code&gt; type&lt;/li&gt;
  &lt;li&gt;Support for Cassandra &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tuple&lt;/code&gt; type&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;time&lt;/code&gt; type in MySQL connector&lt;/li&gt;
  &lt;li&gt;Support for SQLServer &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;datetimeoffset&lt;/code&gt; type&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred’s additional notes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Misc performance and memory usage improvements&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SHOW ROLES&lt;/code&gt; fix&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXPLAIN ANALYZE&lt;/code&gt; fix for estimate display&lt;/li&gt;
  &lt;li&gt;Numerous improvements for Parquet files in Hive and Iceberg connectors&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More info at &lt;a href=&quot;https://trino.io/docs/current/release/release-363.html&quot;&gt;https://trino.io/docs/current/release/release-363.html&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-week-event-stream-abstractions-and-pravega&quot;&gt;Concept of the week: Event stream abstractions and Pravega&lt;/h2&gt;

&lt;h3 id=&quot;events-and-streams&quot;&gt;Events and streams&lt;/h3&gt;

&lt;p&gt;What is an event? This sounds like a silly question when asked generally. The
answer is less clear when discussing event-driven systems, though. An &lt;strong&gt;event&lt;/strong&gt;
is an action or occurrence that is captured by a sensor or generated by a
source system, and emitted to a sink system. Some examples include user
events from an application, system events in telemetry systems, or sensor events
from monitoring applications.&lt;/p&gt;

&lt;p&gt;What is an event stream? Now knowing what an event is, an &lt;strong&gt;event stream&lt;/strong&gt; is an 
unbounded set of events that are tracked over time.&lt;/p&gt;

&lt;p&gt;In this simple view, an event stream contains a sequential list of events. The
list contains events that have been processed, and some that still need to be 
processed.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/28/event-stream.png&quot; /&gt;&lt;br /&gt;
Cloud Native Computing Foundation Presentation: &lt;a href=&quot;https://www.cncf.io/wp-content/uploads/2020/08/pravega-overview-cncf-apr-2020.pdf&quot;&gt;Source&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;This is very different from a more realistic view of event streams, which
considers that events arrive and are processed in parallel. Event load may also
fluctuate, as events can burst around specific occurrences or follow periodic
patterns. While taking event ingest (writes) into consideration, it is
also important to consider event egress (reads) as part of the problem of 
representing event streams.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/28/event-stream-realistic.png&quot; /&gt;&lt;br /&gt;
Cloud Native Computing Foundation Presentation: &lt;a href=&quot;https://www.cncf.io/wp-content/uploads/2020/08/pravega-overview-cncf-apr-2020.pdf&quot;&gt;Source&lt;/a&gt;.
&lt;/p&gt;

&lt;h3 id=&quot;pravega-and-segments&quot;&gt;Pravega and segments&lt;/h3&gt;

&lt;p&gt;Engineers at Dell Labs wanted to find a better abstraction to solve the 
problems they saw in existing event streaming systems. This included how to 
address this type of constant shift in scaling, while also addressing the 
brittle storage abstractions that event streams use today. The storage 
abstraction needs to allow for both real-time and historical analytics. The data
within a particular transaction also needs to be consistent.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/28/segment.png&quot; /&gt;&lt;br /&gt;
Cloud Native Computing Foundation Presentation: &lt;a href=&quot;https://www.cncf.io/wp-content/uploads/2020/08/pravega-overview-cncf-apr-2020.pdf&quot;&gt;Source&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;Their solution is Pravega. The core of Pravega models streams built around a 
storage unit called a segment. A &lt;strong&gt;segment&lt;/strong&gt; is an append-only sequence of bytes
(not events/records). This offers greater flexibility and better 
parallelism and serialization over streams. Pravega stream writers are then able
to write in parallel, increasing ingest throughput.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/28/autoscale-parallel-segment.png&quot; /&gt;&lt;br /&gt;
Cloud Native Computing Foundation Presentation: &lt;a href=&quot;https://www.cncf.io/wp-content/uploads/2020/08/pravega-overview-cncf-apr-2020.pdf&quot;&gt;Source&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;You can use &lt;strong&gt;routing keys&lt;/strong&gt; to map events to particular segments. Pravega
enforces order within a specific key, but does not guarantee ordering of events
across keys. The tradeoff is strict event ordering versus higher 
parallelism and better performance.&lt;/p&gt;

&lt;p&gt;With segments, you can also scale up and scale down the number of segments 
depending on the workload you’re experiencing. Another compelling capability
this enables is managing transactions in the stream. As writers submit data,
they write to a temporary segment, which is merged into a permanent segment on
commit.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/28/segment-transactions.png&quot; /&gt;&lt;br /&gt;
Cloud Native Computing Foundation Presentation: &lt;a href=&quot;https://www.cncf.io/wp-content/uploads/2020/08/pravega-overview-cncf-apr-2020.pdf&quot;&gt;Source&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;The following diagram displays autoscaling splits and merges as specific routing
keys become more popular. To provide a clearer example, say that the routing 
keys are hashed geo-location values for a taxi app, mapped between zero and 
one. Certain locations become crowded, let’s say because a lot of people are 
going home for the work day and many taxis are in the downtown area. The 
locations mapped to the downtown routing keys can automatically trigger a 
split, and once the rush hour is over and traffic slows down, these segments 
are merged again.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/28/segment-split-merge.png&quot; /&gt;&lt;br /&gt;
Pravega Docs: &lt;a href=&quot;https://pravega.io/docs/nightly/pravega-concepts/#elastic-streams-auto-scaling&quot;&gt;Source&lt;/a&gt;.
&lt;/p&gt;

&lt;h3 id=&quot;pravega-architecture&quot;&gt;Pravega architecture&lt;/h3&gt;

&lt;p&gt;The Pravega architecture comes with writer groups and reader groups that scale
up and down along with the autoscaling applied to the segments. It consists of
a controller that maintains stream metadata and a segment store that works off
of tier one storage (Apache BookKeeper) and tier two storage (object storage).&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/28/pravega-architecture.png&quot; /&gt;&lt;br /&gt;
Pravega Docs: &lt;a href=&quot;https://pravega.io/docs/nightly/pravega-concepts/#elastic-streams-auto-scaling&quot;&gt;Source&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;Just like Trino, Pravega also aims to build a rich set of connectors with
systems that act as a source and sink. This includes a connector used for Trino.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/28/pravega-connectors.png&quot; /&gt;&lt;br /&gt;
Pravega Docs: &lt;a href=&quot;https://pravega.io/docs/nightly/pravega-concepts/#elastic-streams-auto-scaling&quot;&gt;Source&lt;/a&gt;.
&lt;/p&gt;

&lt;h3 id=&quot;pravega-compared-to-other-event-streaming-platforms&quot;&gt;Pravega compared to other event streaming platforms&lt;/h3&gt;

&lt;p&gt;This chart is a very helpful resource summarizing Pravega against other popular 
streaming platforms. It comes from the Pravega site, so be sure to check there 
for an up-to-date feature list moving forward.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th&gt;Pravega&lt;/th&gt;
      &lt;th&gt;Kafka&lt;/th&gt;
      &lt;th&gt;Pulsar&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Transactions&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Event streams&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Long-term retention&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Durable by default&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Auto-scaling&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Ingestion of large data (video)&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Efficient at high partition counts&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Consistent state replication&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Key-value tables&lt;/td&gt;
      &lt;td&gt;✅&lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt; &lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Comparison between Pravega, Kafka, and Pulsar: &lt;a href=&quot;https://pravega.io&quot;&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-week-querying-pravega-from-trino&quot;&gt;Demo of the week: Querying Pravega from Trino&lt;/h2&gt;

&lt;p&gt;This week the Pravega team demonstrates an example from their &lt;a href=&quot;https://github.com/pravega/presto-connector/tree/main/getting-started&quot;&gt;getting-started&lt;/a&gt;
tutorial for the Trino connector.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pravega-presto-connector-pr-49&quot;&gt;PR of the week: Pravega presto-connector PR 49&lt;/h2&gt;

&lt;p&gt;This week’s &lt;a href=&quot;https://github.com/pravega/presto-connector/pull/49&quot;&gt;PR of the week&lt;/a&gt;
doesn’t come from the Trino repository, but rather from the presto-connector
repository. The Trino portion of the repository was contributed by Dell engineer 
Karan Singh. As the PR states, this makes Pravega available from Trino alongside 
the original Presto connector.&lt;/p&gt;

&lt;p&gt;Thanks Karan for adding Trino and Andrew for writing the original Presto-Pravega
connector!&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-week-what-is-the-point-of-trino-forum-and-what-is-the-relationship-to-trino-slack&quot;&gt;Question of the week: What is the point of Trino Forum and what is the relationship to Trino Slack?&lt;/h2&gt;

&lt;p&gt;Our &lt;a href=&quot;https://www.trinoforum.org/t/what-is-the-point-of-this-forum-and-what-is-the-relationship-to-trino-slack/28&quot;&gt;question of the week&lt;/a&gt;
comes from the new Trino Forum, which Brian and a few others at Starburst
created. Slack is a much more ad hoc platform for people to work 
through problems rather than a place to search for and find existing solutions.
The Trino community has a great amount of knowledge accumulated in Slack,
but there is no way for people to find answers unless they have already joined, 
and none of the information discussed there can be found by a search engine like
Google.&lt;/p&gt;

&lt;p&gt;Further, a lot of the answers are scattered across different conversations, and 
this too can be condensed and simplified. I pondered the best way for us 
to expose this knowledge, and considered adding an FAQ page on &lt;a href=&quot;https://trino.io&quot;&gt;trino.io&lt;/a&gt;, but that
would get stale quickly and would require a lot of maintenance work at scale
without a crowdsourcing element. Instead, a &lt;a href=&quot;https://www.discourse.org&quot;&gt;Discourse forum&lt;/a&gt; 
(not to be confused with Discord) acts as a central repository of knowledge and 
makes this information easily searchable. The forum is maintained by some of us 
at Starburst, but over time we want more moderators from the community (this 
happens through merit and consistent participation, using Discourse trust levels).&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs and resources&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.cncf.io/online-programs/pravega-rethinking-storage-for-streams/&quot;&gt;Pravega: Rethinking Storage For Streams&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests</summary>

      
      
    </entry>
  
    <entry>
      <title>27: Trino gits to wade in the data LakeFS</title>
      <link href="https://trino.io/episodes/27.html" rel="alternate" type="text/html" title="27: Trino gits to wade in the data LakeFS" />
      <published>2021-09-30T00:00:00+00:00</published>
      <updated>2021-09-30T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/27</id>
      <content type="html" xml:base="https://trino.io/episodes/27.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Paul Singman, Developer Advocate at &lt;a href=&quot;https://treeverse.io/&quot;&gt;Treeverse&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/datawhisp&quot;&gt;@datawhisp&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;trino-summit-2021&quot;&gt;Trino Summit 2021&lt;/h2&gt;

&lt;p&gt;Get ready for &lt;a href=&quot;https://www.starburst.io/info/trinosummit/&quot;&gt;Trino Summit&lt;/a&gt;, coming
October 21st and 22nd! This annual Trino community event is where we gather 
practitioners that deploy Trino at scale, and share their experiences and best 
practices with the rest of the community. While the planning for this event was 
a bit chaotic due to the pandemic, we have made the final decision to host the 
event virtually for the safety of all the attendees. We look forward to seeing
you there, and can’t wait to share more information in the coming weeks!&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-week-lakefs-and-git-on-object-storage&quot;&gt;Concept of the week: LakeFS and Git on object storage&lt;/h2&gt;

&lt;p&gt;LakeFS offers git-like semantics over your files in the data lake. Akin to the
versioning you can do on Iceberg, you can also version your data with LakeFS, 
and roll back to previous commits when you make a mistake. LakeFS allows you to 
roll out new features in production or prod-like environments with ease and 
isolation from the real data. Join us as we dive into this awesome new way to 
approach versioning on your data!&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/27/trino-lakefs.jpg&quot; /&gt;&lt;br /&gt;
Why we built LakeFS: &lt;a href=&quot;https://lakefs.io/why-we-built-lakefs-atomic-and-versioned-data-lake-operations/&quot;&gt;Source&lt;/a&gt;.
&lt;/p&gt;

&lt;h3 id=&quot;features&quot;&gt;Features&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Exabyte-scale version control&lt;/li&gt;
  &lt;li&gt;Git-like operations: branch, commit, merge, revert&lt;/li&gt;
  &lt;li&gt;Zero copy branching for frictionless experiments&lt;/li&gt;
  &lt;li&gt;Full reproducibility of data and code&lt;/li&gt;
  &lt;li&gt;Pre-commit/merge hooks for data CI/CD&lt;/li&gt;
  &lt;li&gt;Instantly revert changes to data&lt;/li&gt;
&lt;/ul&gt;
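
&lt;p&gt;The zero copy branching idea can be modeled in a few lines: a branch is just
a mapping from logical paths to immutable object IDs, so creating a branch
copies metadata, never files. This is a conceptual sketch, not the lakeFS
implementation:&lt;/p&gt;

```python
# Toy model: branches are pointer maps from logical paths to object IDs.
repo = {"main": {"tiny/customer/part-0": "obj-111",
                 "tiny/orders/part-0": "obj-222"}}

def create_branch(repo, source, name):
    repo[name] = dict(repo[source])  # copy the pointer map, not the objects

def write(repo, branch, path, object_id):
    repo[branch][path] = object_id   # only this branch sees the new object

create_branch(repo, "main", "sandbox")
write(repo, "sandbox", "tiny/lineitem/part-0", "obj-333")

assert "tiny/lineitem/part-0" in repo["sandbox"]
assert "tiny/lineitem/part-0" not in repo["main"]  # main stays isolated
# unchanged paths on both branches still point at the same objects
assert repo["main"]["tiny/orders/part-0"] == repo["sandbox"]["tiny/orders/part-0"]
```

A merge in this model just folds the sandbox pointer map back into main, which
is why it can be atomic and instantly revertable.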

&lt;h3 id=&quot;use-cases&quot;&gt;Use cases&lt;/h3&gt;

&lt;h4 id=&quot;in-development&quot;&gt;In development&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Experiment - try new tools, upgrade versions, and evaluate code changes in 
isolation. By creating a branch of the data you get an isolated snapshot to run 
experiments over, while others are not exposed. Compare between branches with 
different experiments or to the main branch of the repository to understand a 
change’s impact.&lt;/li&gt;
  &lt;li&gt;Debug - checkout specific commits in a repository’s commit history to 
materialize consistent, historical versions of your data. See the exact state of
your data at the point-in-time of an error to understand its root cause.&lt;/li&gt;
  &lt;li&gt;Collaborate - avoid managing data access at the two extremes of either 
treating your data lake like a shared folder or creating multiple copies of the
data to safely collaborate. Instead, leverage isolated branches managed by 
metadata (not copies of files) to work in parallel.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;during-deployment&quot;&gt;During deployment&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Version Control - deploy data safely with CI/CD workflows borrowed from 
software engineering best practices. Ingest new data onto an isolated branch, 
perform data validations, then add to production through a merge operation.&lt;/li&gt;
  &lt;li&gt;Test - define pre-merge and pre-commit hooks to run tests that enforce schema 
and validate properties of the data to catch issues before they reach 
production.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;in-production&quot;&gt;In production&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Roll back - recover from errors by instantly reverting data to a former, 
consistent snapshot of the data lake. Choose any commit in a repository’s commit
 history to revert in one atomic action.&lt;/li&gt;
  &lt;li&gt;Troubleshoot - investigate production errors by starting with a snapshot of 
the inputs to the failed process. Spend less time re-creating the state of 
datasets at the time of failure, and more time finding the solution.&lt;/li&gt;
  &lt;li&gt;Cross-collection consistency - provide consumers multiple synchronized 
collections of data in one atomic, revertable action. Using branches, writers 
provide consistency guarantees across different logical collections - merging to
 the main branch only after all relevant datasets have been created or updated 
 successfully.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Source: &lt;a href=&quot;https://docs.lakefs.io/#use-cases&quot;&gt;https://docs.lakefs.io/#use-cases&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;demo-of-the-week-running-trino-on-lakefs&quot;&gt;Demo of the week: Running Trino on LakeFS&lt;/h2&gt;

&lt;p&gt;In order to run Trino and LakeFS, you need Docker installed on your system with at least 4GB
of memory allocated to Docker.&lt;/p&gt;

&lt;p&gt;Let’s start up the LakeFS instance and the required PostgreSQL instance along 
with the typical Trino containers used with the Hive connector. 
Clone the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trino-getting-started&lt;/code&gt; repository and navigate to the 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;community_tutorials/lakefs/trino-lakefs-minio/&lt;/code&gt; directory.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git@github.com:bitsondatadev/trino-getting-started.git

cd trino-getting-started/community_tutorials/lakefs/trino-lakefs-minio/

docker-compose up -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once this is done, you can navigate to the following locations to verify that
everything started correctly.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Navigate to &lt;a href=&quot;http://localhost:8000&quot;&gt;http://localhost:8000&lt;/a&gt; to open the LakeFS user interface.&lt;/li&gt;
  &lt;li&gt;Log in with Access Key, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AKIAIOSFODNN7EXAMPLE&lt;/code&gt;, and Secret Access Key, 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Verify that the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;example&lt;/code&gt; repository exists in the UI and open it.&lt;/li&gt;
  &lt;li&gt;The branch &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; in the repository, found under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;example/main/&lt;/code&gt;, should be 
empty.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Once you have verified the repository exists, let’s go ahead and create a schema
under the Trino Hive catalog called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;minio&lt;/code&gt;. This catalog originally pointed directly
at MinIO, but is now wrapped by LakeFS to add the git-like layer around the file storage.&lt;/p&gt;

&lt;p&gt;Name the schema &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tiny&lt;/code&gt; as that is the schema we copy from the TPCH data set. 
Notice the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;location&lt;/code&gt; property of the schema. It now has a namespace that is 
prefixed before the actual &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tiny/&lt;/code&gt; table directory. The prefix contains the 
repository name, then the branch name. All together this follows the pattern of 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;protocol&amp;gt;://&amp;lt;repository&amp;gt;/&amp;lt;branch&amp;gt;/&amp;lt;schema&amp;gt;/&lt;/code&gt;.&lt;/p&gt;
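&lt;p&gt;A small helper (hypothetical, for illustration only) that builds and parses
this location pattern:&lt;/p&gt;

```python
# Build and parse the <protocol>://<repository>/<branch>/<schema> pattern
# used for LakeFS-backed schema locations.
def make_location(protocol, repository, branch, schema):
    return f"{protocol}://{repository}/{branch}/{schema}"

def parse_location(location):
    protocol, rest = location.split("://", 1)
    repository, branch, schema = rest.split("/", 2)
    return {"protocol": protocol, "repository": repository,
            "branch": branch, "schema": schema}

loc = make_location("s3a", "example", "main", "tiny")
assert loc == "s3a://example/main/tiny"
assert parse_location(loc)["branch"] == "main"
```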

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE SCHEMA minio.tiny
WITH (location = &apos;s3a://example/main/tiny&apos;);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, create two tables, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;customer&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;orders&lt;/code&gt;, by setting &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;external_location&lt;/code&gt;
using the same namespace used in the schema and adding the table name. The query
retrieves the data from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tiny&lt;/code&gt; TPCH data set.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE TABLE minio.tiny.customer
WITH (
  format = &apos;ORC&apos;,
  external_location = &apos;s3a://example/main/tiny/customer/&apos;
) 
AS SELECT * FROM tpch.tiny.customer;

CREATE TABLE minio.tiny.orders
WITH (
  format = &apos;ORC&apos;,
  external_location = &apos;s3a://example/main/tiny/orders/&apos;
) 
AS SELECT * FROM tpch.tiny.orders;

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Verify that you can see the table directories in LakeFS once they exist.
&lt;a href=&quot;http://localhost:8000/repositories/example/objects?ref=main&amp;amp;path=tiny%2F&quot;&gt;http://localhost:8000/repositories/example/objects?ref=main&amp;amp;path=tiny%2F&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Run a query on these two tables using the standard table pointing to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt;
branch.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT ORDERKEY, ORDERDATE, SHIPPRIORITY
FROM minio.tiny.customer c, minio.tiny.orders o
WHERE MKTSEGMENT = &apos;BUILDING&apos; AND c.CUSTKEY = o.CUSTKEY AND
ORDERDATE &amp;lt; date&apos;1995-03-15&apos;
GROUP BY ORDERKEY, ORDERDATE, SHIPPRIORITY
ORDER BY ORDERDATE;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Open the &lt;a href=&quot;http://localhost:8000/repositories/example/objects?ref=main&quot;&gt;LakeFS UI again&lt;/a&gt; 
and click on the &lt;strong&gt;Unversioned Changes&lt;/strong&gt; tab. Click &lt;strong&gt;Commit Changes&lt;/strong&gt;. Type a 
commit message on the popup and click &lt;strong&gt;Commit Changes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Once the changes are committed on branch &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt;, click on the &lt;strong&gt;Branches&lt;/strong&gt; tab.
Click &lt;strong&gt;Create Branch&lt;/strong&gt;. Name a new branch &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sandbox&lt;/code&gt; that branches off of the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; branch. Now click &lt;strong&gt;Create&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Although there is a branch that exists called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sandbox&lt;/code&gt;, this only exists 
logically. We need to make Trino aware by adding another schema and tables 
that point to the new branch. Do this by making a new schema called 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tiny_sandbox&lt;/code&gt; and changing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;location&lt;/code&gt; property to point to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sandbox&lt;/code&gt;
branch instead of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; branch.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE SCHEMA minio.tiny_sandbox
WITH (location = &apos;s3a://example/sandbox/tiny&apos;);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tiny_sandbox&lt;/code&gt; schema exists, we can copy the table definitions
of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;customer&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;orders&lt;/code&gt; table from the original tables created. We got
the schema for free by copying it directly from the TPCH data using the CTAS 
statement. We don’t want to use CTAS in this case as it not only copies the 
table definition, but also the data. This duplication of data is unnecessary and
is what creating a branch in LakeFS avoids. We want to just copy the table
definition using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SHOW CREATE TABLE&lt;/code&gt; statement.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SHOW CREATE TABLE minio.tiny.customer;
SHOW CREATE TABLE minio.tiny.orders;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Take the output and update the schema to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tiny_sandbox&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;external_location&lt;/code&gt;
to point to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sandbox&lt;/code&gt; for both tables.&lt;/p&gt;
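&lt;p&gt;Since the branch name only appears in the schema name and the location here,
a mechanical string replace over the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SHOW CREATE TABLE&lt;/code&gt; output is enough for
this sketch (illustrative only; real DDL may need more careful editing):&lt;/p&gt;

```python
# Rewrite a (shortened, example) DDL string so the copied definition targets
# the sandbox schema and the sandbox branch location.
ddl = """CREATE TABLE minio.tiny.customer (
   custkey bigint
)
WITH (
   external_location = 's3a://example/main/tiny/customer',
   format = 'ORC'
)"""

sandbox_ddl = (ddl
               .replace("minio.tiny.", "minio.tiny_sandbox.")
               .replace("s3a://example/main/", "s3a://example/sandbox/"))

assert "minio.tiny_sandbox.customer" in sandbox_ddl
assert "s3a://example/sandbox/tiny/customer" in sandbox_ddl
assert "example/main" not in sandbox_ddl  # no stale references to main
```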

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE TABLE minio.tiny_sandbox.customer (
   custkey bigint,
   name varchar(25),
   address varchar(40),
   nationkey bigint,
   phone varchar(15),
   acctbal double,
   mktsegment varchar(10),
   comment varchar(117)
)
WITH (
   external_location = &apos;s3a://example/sandbox/tiny/customer&apos;,
   format = &apos;ORC&apos;
);

CREATE TABLE minio.tiny_sandbox.orders (
   orderkey bigint,
   custkey bigint,
   orderstatus varchar(1),
   totalprice double,
   orderdate date,
   orderpriority varchar(15),
   clerk varchar(15),
   shippriority integer,
   comment varchar(79)
)
WITH (
   external_location = &apos;s3a://example/sandbox/tiny/orders&apos;,
   format = &apos;ORC&apos;
);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once these table definitions exist, go ahead and run the same query as before,
but update using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tiny_sandbox&lt;/code&gt; schema instead of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tiny&lt;/code&gt; schema.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT ORDERKEY, ORDERDATE, SHIPPRIORITY
FROM minio.tiny_sandbox.customer c, minio.tiny_sandbox.orders o
WHERE MKTSEGMENT = &apos;BUILDING&apos; AND c.CUSTKEY = o.CUSTKEY AND
ORDERDATE &amp;lt; date&apos;1995-03-15&apos;
ORDER BY ORDERDATE;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;One last bit of functionality we want to test is the merging capabilities. To
do this, create a table called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lineitem&lt;/code&gt; in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sandbox&lt;/code&gt; branch using a CTAS
statement.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE TABLE minio.tiny_sandbox.lineitem
WITH (
  format = &apos;ORC&apos;,
  external_location = &apos;s3a://example/sandbox/tiny/lineitem/&apos;
) 
AS SELECT * FROM tpch.tiny.lineitem;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Verify that you can see three table directories in LakeFS including &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lineitem&lt;/code&gt; 
in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sandbox&lt;/code&gt; branch.
&lt;a href=&quot;http://localhost:8000/repositories/example/objects?ref=sandbox&amp;amp;path=tiny%2F&quot;&gt;http://localhost:8000/repositories/example/objects?ref=sandbox&amp;amp;path=tiny%2F&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Verify that you do not see &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lineitem&lt;/code&gt; in the table directories in LakeFS in the 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; branch.
&lt;a href=&quot;http://localhost:8000/repositories/example/objects?ref=main&amp;amp;path=tiny%2F&quot;&gt;http://localhost:8000/repositories/example/objects?ref=main&amp;amp;path=tiny%2F&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also verify this by running queries against &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lineitem&lt;/code&gt; in the tables
pointing to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sandbox&lt;/code&gt; branch that should fail on the tables pointing to the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; branch.&lt;/p&gt;

&lt;p&gt;To merge the new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lineitem&lt;/code&gt; table into the main branch, first commit 
the new change to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sandbox&lt;/code&gt; by again going to the &lt;strong&gt;Unversioned Changes&lt;/strong&gt; tab. 
Click &lt;strong&gt;Commit Changes&lt;/strong&gt;. Type a commit message on the popup and click 
&lt;strong&gt;Commit Changes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Once the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lineitem&lt;/code&gt; addition is committed, click on the &lt;strong&gt;Compare&lt;/strong&gt; tab. Set the
base branch to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; and the compared branch to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sandbox&lt;/code&gt;. You should see
the addition of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lineitem&lt;/code&gt; show up in the diff view. Click &lt;strong&gt;Merge&lt;/strong&gt; and then click
&lt;strong&gt;Yes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Once this is merged you should see the table data show up in LakeFS. Verify that
you can see &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;lineitem&lt;/code&gt; in the table directories in LakeFS in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; branch.
&lt;a href=&quot;http://localhost:8000/repositories/example/objects?ref=main&amp;amp;path=tiny%2F&quot;&gt;http://localhost:8000/repositories/example/objects?ref=main&amp;amp;path=tiny%2F&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As before, we won’t be able to query this data from Trino until we run the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SHOW CREATE TABLE&lt;/code&gt; from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tiny_sandbox&lt;/code&gt; schema and use the output to create
the table in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tiny&lt;/code&gt; schema that is pointing to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-8762-add-query-error-info-to-cluster-overview-page-in-web-ui&quot;&gt;PR of the week: PR 8762 Add query error info to cluster overview page in web UI&lt;/h2&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/trinodb/trino/pull/8762&quot;&gt;PR of the week&lt;/a&gt; adds some 
really useful context around query failures in the Trino Web UI. This PR was
created by &lt;a href=&quot;https://github.com/posulliv&quot;&gt;Pádraig O’Sullivan&lt;/a&gt;. For many, it can
be frustrating when a query fails and you have to do a lot of digging before you
even understand the type of error that is happening. This PR gives a better
highlight of what failed, so you don’t have to do a lot of investigation 
upfront to get a sense of what is happening and where to look next.&lt;/p&gt;

&lt;p&gt;Thank you so much Pádraig!&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-week-why-are-deletes-so-limited-in-trino&quot;&gt;Question of the week: Why are deletes so limited in Trino?&lt;/h2&gt;

&lt;p&gt;Our &lt;a href=&quot;https://trinodb.slack.com/archives/CGB0QHWSW/p1632775855390300&quot;&gt;question of the week&lt;/a&gt;
comes from Marius Grama on our Trino community Slack. Marius created the 
&lt;a href=&quot;https://github.com/findinpath/dbt-trino-incremental-hive&quot;&gt;dbt-trino&lt;/a&gt; adapter 
and wants to implement &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT OVERWRITE&lt;/code&gt; functionality.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT OVERWRITE&lt;/code&gt; checks whether there are entries in the target table that 
exist as well in the staging table, and it first deletes the target entries, 
before inserting the staging entries. Unfortunately, the delete didn’t work for
RDBMS, Hive, or Iceberg. His question is whether this is a limitation of Trino for 
all connectors, and how we can approach the “delete” part of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT OVERWRITE&lt;/code&gt;.&lt;/p&gt;
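&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INSERT OVERWRITE&lt;/code&gt; semantics described above can be sketched in plain
Python (illustrative only, not Trino internals):&lt;/p&gt;

```python
# Rows in the target whose keys also appear in staging are deleted first,
# then all staging rows are inserted.
def insert_overwrite(target, staging, key):
    staged_keys = {row[key] for row in staging}
    kept = [row for row in target if row[key] not in staged_keys]
    return kept + list(staging)

target = [{"id": 1, "v": "old"}, {"id": 2, "v": "keep"}]
staging = [{"id": 1, "v": "new"}, {"id": 3, "v": "added"}]

result = insert_overwrite(target, staging, "id")
assert {"id": 1, "v": "new"} in result      # overwritten
assert {"id": 2, "v": "keep"} in result     # untouched
assert {"id": 3, "v": "added"} in result    # inserted
assert {"id": 1, "v": "old"} not in result  # deleted before insert
```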

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs and Resources&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://lakefs.io/hive-metastore-why-its-still-here-and-what-can-replace-it/&quot;&gt;Hive Metastore - Why its still here and what can replace it&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://lakefs.io/hive-metastore-it-didnt-age-well/&quot;&gt;Hive Metastore - It didn’t age well&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://lakefs.io/hudi-iceberg-and-delta-lake-data-lake-table-formats-compared/&quot;&gt;Hudi, Iceberg, Delta Lake Table Formats Compared&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://lakefs.io/the-docker-everything-bagel-spin-up-a-local-data-stack/&quot;&gt;The Docker Everything Bagel&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests</summary>

      
      
    </entry>
  
    <entry>
      <title>26: Trino discovers data catalogs with Amundsen</title>
      <link href="https://trino.io/episodes/26.html" rel="alternate" type="text/html" title="26: Trino discovers data catalogs with Amundsen" />
      <published>2021-09-16T00:00:00+00:00</published>
      <updated>2021-09-16T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/26</id>
      <content type="html" xml:base="https://trino.io/episodes/26.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;Mark Grover, Co-creator of Amundsen and Founder at &lt;a href=&quot;https://www.stemma.ai/&quot;&gt;Stemma&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/mark_grover&quot;&gt;@mark_grover&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;release-362&quot;&gt;Release 362&lt;/h2&gt;

&lt;p&gt;Official announcement items from Martin are not yet available since the release is 
not out… but soon.&lt;/p&gt;

&lt;p&gt;Manfred’s notes:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Add new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;listagg&lt;/code&gt; function contributed by Marius&lt;/li&gt;
  &lt;li&gt;Join performance and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DISTINCT&lt;/code&gt; performance improvements&lt;/li&gt;
  &lt;li&gt;SQL security related changes in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALTER SCHEMA&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Add &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IN table&lt;/code&gt; for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CREATE&lt;/code&gt;/&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DROP&lt;/code&gt;/… &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ROLE&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Whole bunch of improvements in the BigQuery connector&lt;/li&gt;
  &lt;li&gt;Numerous improvements for Parquet file usage in Hive connector&lt;/li&gt;
  &lt;li&gt;All connector docs now have a SQL support section&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-week-data-discovery-and-amundsen&quot;&gt;Concept of the week: Data discovery and Amundsen&lt;/h2&gt;

&lt;p&gt;Data discovery is a process that aids in the analysis of data where siloed data 
has been centralized, and it is difficult to find data or overlap between
disparate data sets. Many teams have their own view of the world when it comes 
to the data they need, but they commonly need to reason about how their data 
relates to data outside of their domain.&lt;/p&gt;

&lt;p&gt;There are typically questions about who owns what data to help identify 
individuals responsible for maintaining the standards. Additionally, there are 
also issues around providing documentation around the data, and to identify who 
to call for help if there are issues using the data. This allows analysts to 
discover patterns in the data, and periodically audit the data storage 
practices. Interesting questions also arise around existing policies, which can 
encourage a system of record that acts as a shared front end around data 
policies.&lt;/p&gt;

&lt;h3 id=&quot;what-is-amundsen&quot;&gt;What is Amundsen?&lt;/h3&gt;

&lt;p&gt;Amundsen provides data discovery by using ETL processes to scrape metadata from
all of the data sources. It creates a central location to collect all that 
metadata and enables search and other analytics of this metadata. Here’s how the
project describes itself on &lt;a href=&quot;https://www.amundsen.io/amundsen/&quot;&gt;the Amundsen website&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Amundsen is a data discovery and metadata engine for improving the 
productivity of data analysts, data scientists and engineers when interacting
with data. It does that today by indexing data resources (tables, dashboards,
streams, etc.) and powering a page-rank style search based on usage patterns 
(e.g. highly queried tables show up earlier than less queried tables).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Amundsen has an architecture that interacts primarily with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;information_schema&lt;/code&gt;
tables, among other metadata, depending on the data source. In Trino’s case, 
&lt;a href=&quot;https://github.com/amundsen-io/amundsen/blob/main/databuilder/databuilder/extractor/presto_view_metadata_extractor.py&quot;&gt;the extractor used&lt;/a&gt; 
connects directly to the Hive metastore database, for Trino views, since 
they’re stored there. Physical tables use the &lt;a href=&quot;https://github.com/amundsen-io/amundsen/blob/main/databuilder/databuilder/extractor/hive_table_metadata_extractor.py&quot;&gt;HiveTableMetadataExtractor&lt;/a&gt;
to load these tables into Amundsen. This makes sense since the data is stored in
the Hive table format. For non-Hive use cases, you generally want to bypass
using Trino (for now) and directly connect Amundsen to each data source.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/26/amundsen-architecture.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;Amundsen includes an ETL framework called &lt;a href=&quot;https://www.amundsen.io/amundsen/databuilder/&quot;&gt;databuilder&lt;/a&gt;
that runs multiple jobs. Jobs contain an ETL task to extract the metadata and 
load it into the two databases that are central to Amundsen, Neo4j and 
Elasticsearch. Neo4j stores the core metadata that is represented on the UI. 
Elasticsearch enables search over the many fields in the metadata. Ingestion via
ETL follows the following steps:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Ingest base data to Neo4j.&lt;/li&gt;
  &lt;li&gt;Ingest additional data and decorate Neo4j over base data.&lt;/li&gt;
  &lt;li&gt;Update Elasticsearch index using Neo4j data.&lt;/li&gt;
  &lt;li&gt;Remove stale data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each job contains an ETL task. The task must define an extractor and a loader, 
and optionally a translator. You can see example configurations for different
extractors on the website, like the &lt;a href=&quot;https://www.amundsen.io/amundsen/databuilder/#hivetablemetadataextractor&quot;&gt;example for the HiveTableMetadataExtractor&lt;/a&gt;.&lt;/p&gt;
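&lt;p&gt;As a rough illustration of the extractor/loader/transformer wiring described
above, here is a minimal extract-transform-load task in plain Python. The class
and function names are illustrative only, not the real databuilder API:&lt;/p&gt;

```python
# Minimal sketch of an ETL task in the spirit of an Amundsen databuilder
# job: an extractor emits records one at a time, an optional transformer
# decorates them, and a loader persists them. Names here are made up.

class StaticExtractor:
    def __init__(self, records):
        self._records = iter(records)

    def extract(self):
        # Return the next record, or None when the stream is exhausted
        return next(self._records, None)

class ListLoader:
    def __init__(self):
        self.loaded = []

    def load(self, record):
        self.loaded.append(record)

def run_task(extractor, loader, transformer=None):
    while (record := extractor.extract()) is not None:
        if transformer:
            record = transformer(record)
        loader.load(record)

extractor = StaticExtractor([{"table": "customer"}, {"table": "orders"}])
loader = ListLoader()
run_task(extractor, loader, transformer=lambda r: {**r, "source": "hive"})
```

&lt;p&gt;In real databuilder jobs the loader targets Neo4j or Elasticsearch rather
than an in-memory list, but the extract-transform-load loop has this shape.&lt;/p&gt;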

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/26/amundsen-job.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;The metadata is modeled using a graph representation in neo4j and optionally
&lt;a href=&quot;https://atlas.apache.org/#/&quot;&gt;Apache Atlas&lt;/a&gt; to model advanced concepts, such as,
lineage and other relations.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/26/amundsen-metadata.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;You can learn more about the &lt;a href=&quot;https://www.amundsen.io/amundsen/databuilder/docs/models/&quot;&gt;models in the metadata here&lt;/a&gt;.&lt;/p&gt;

&lt;h4 id=&quot;amundsen-resources&quot;&gt;Amundsen resources&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Docs: &lt;a href=&quot;https://www.amundsen.io/amundsen/&quot;&gt;https://www.amundsen.io/amundsen/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;GitHub: &lt;a href=&quot;https://github.com/amundsen-io/amundsen&quot;&gt;https://github.com/amundsen-io/amundsen&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;YouTube: &lt;a href=&quot;https://www.youtube.com/playlist?list=PL0UJdxehTNlKnGU_h7k2fzJyvAiufeh1U&quot;&gt;https://www.youtube.com/playlist?list=PL0UJdxehTNlKnGU_h7k2fzJyvAiufeh1U&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Slack: &lt;a href=&quot;https://join.slack.com/t/amundsenworkspace/shared_invite/enQtNTk2ODQ1NDU1NDI0LTc3MzQyZmM0ZGFjNzg5MzY1MzJlZTg4YjQ4YTU0ZmMxYWU2MmVlMzhhY2MzMTc1MDg0MzRjNTA4MzRkMGE0Nzk&quot;&gt;Join&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;amundsen-as-a-subcomponent-to-data-mesh&quot;&gt;Amundsen as a subcomponent to data mesh&lt;/h3&gt;

&lt;p&gt;A new architecture, philosophy, and yes, &lt;a href=&quot;https://www.merriam-webster.com/dictionary/buzzword&quot;&gt;buzzword&lt;/a&gt; 
that is gaining momentum is the &lt;em&gt;data mesh&lt;/em&gt;. While it is certainly not yet 
concretely defined and is still in the research and development phase, data mesh is
gaining a lot of attention as a potential alternative to data lakes and data 
warehouses for analytics solutions.&lt;/p&gt;

&lt;p&gt;Data mesh mirrors the philosophy of microservice architecture. It argues that 
data should be defined and maintained by teams responsible for their business 
domain similar to how the responsibility is delegated at the service layer. 
Since not everyone is going to be a data engineer on the domain team, there must
be some consideration for the architecture of such a platform. The author of 
this paradigm, Zhamak Dehghani, lays out four principles that characterize a data 
mesh. They are listed below, with the systems that provide some or all of a 
solution for each principle in parentheses.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Domain-oriented decentralized data ownership and architecture (Trino &amp;amp; Amundsen)&lt;/li&gt;
  &lt;li&gt;Data as a product (Amundsen)&lt;/li&gt;
  &lt;li&gt;Self-serve data infrastructure as a platform (Trino)&lt;/li&gt;
  &lt;li&gt;Federated computational governance (Amundsen to some extent)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;stemma&quot;&gt;Stemma&lt;/h3&gt;

&lt;p&gt;Like with many successful open source projects, there are enterprise products 
that build on and support the open source project. &lt;a href=&quot;https://www.stemma.ai/&quot;&gt;Stemma&lt;/a&gt; 
is the enterprise company that supports Amundsen. It’s founded by Mark and 
others central to the open source project.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-index-trino-views&quot;&gt;PR of the week: Index Trino views&lt;/h2&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/amundsen-io/amundsen/commit/4cfc55d311ca7bc9b02df26ece3b4bde5eedecd6#diff-1c6e94c4ea77e16625f97d4e029f5611d3f3b10d428ab6038edc0b931df4243c&quot;&gt;PR (or should we say commit) of the week&lt;/a&gt; 
adds the original Trino extractor. As mentioned above, this extractor is only
needed for views, as the physical tables exist in Hive and are retrieved with the 
HiveTableMetadataExtractor.&lt;/p&gt;

&lt;h3 id=&quot;call-to-contribute-to-amundsen&quot;&gt;Call to contribute to Amundsen&lt;/h3&gt;

&lt;p&gt;If you want to help out, you can consider adding the Trino image similar to 
&lt;a href=&quot;https://github.com/amundsen-io/amundsenfrontendlibrary/commit/4e24bfe1c1cd3c6cf568ee1b3e39580686fafbe6&quot;&gt;this commit completed a while back&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;demo-extracting-metadata-from-hive-metastore-and-loading-it-into-amundsen&quot;&gt;Demo: Extracting metadata from Hive metastore and loading it into Amundsen&lt;/h2&gt;

&lt;p&gt;There were technical difficulties on the day of broadcasting the show, so the
demo was moved to its own separate video.&lt;/p&gt;

&lt;div class=&quot;youtube-video-container&quot;&gt;
  &lt;iframe width=&quot;702&quot; height=&quot;405&quot; src=&quot;https://www.youtube.com/embed/m-mL00FkWd0&quot; title=&quot;YouTube video player&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;The steps in this demo are adapted from the &lt;a href=&quot;https://www.amundsen.io/amundsen/installation/&quot;&gt;Amundsen installation page&lt;/a&gt;.
Clone the trino-getting-started repository and navigate to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trino-getting-started/community_tutorials/amundsen&lt;/code&gt; 
directory. For this demo you need at least 3GB of memory allocated to your 
Docker application.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git@github.com:bitsondatadev/trino-getting-started.git

cd community_tutorials/amundsen

docker-compose up -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once all the services are running, clone the Amundsen repository in a separate
terminal. Then navigate to the databuilder folder and install all the 
dependencies:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone --recursive https://github.com/amundsen-io/amundsen.git
cd amundsen/databuilder
python3 -m venv venv
source venv/bin/activate
pip3 install --upgrade pip
pip3 install -r requirements.txt
python3 setup.py install
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Navigate to MinIO at &lt;a href=&quot;http://localhost:9000&quot;&gt;http://localhost:9000&lt;/a&gt; to create the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tiny&lt;/code&gt; bucket for the
schema in Trino to map to. In Trino, create a schema and a couple tables in the 
existing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;minio&lt;/code&gt; catalog:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE SCHEMA minio.tiny
WITH (location = &apos;s3a://tiny/&apos;);

CREATE TABLE minio.tiny.customer
WITH (
  format = &apos;ORC&apos;,
  external_location = &apos;s3a://tiny/customer/&apos;
) 
AS SELECT * FROM tpch.tiny.customer;

CREATE TABLE minio.tiny.orders
WITH (
  format = &apos;ORC&apos;,
  external_location = &apos;s3a://tiny/orders/&apos;
) 
AS SELECT * FROM tpch.tiny.orders;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Navigate back to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trino-getting-started/community_tutorials/amundsen&lt;/code&gt; directory in the same 
Python virtual environment you just opened.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;cd trino-getting-started/community_tutorials/amundsen
python3 assets/scripts/sample_trino_data_loader.py
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;View the Amundsen UI at &lt;a href=&quot;http://localhost:5000&quot;&gt;http://localhost:5000&lt;/a&gt; and try a search; it 
should return the tables you just created.&lt;/p&gt;

&lt;p&gt;You can verify dummy data has been ingested into Neo4j by visiting &lt;a href=&quot;http://localhost:7474/browser/&quot;&gt;http://localhost:7474/browser/&lt;/a&gt;.
Log in as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;neo4j&lt;/code&gt; with the password &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test&lt;/code&gt; and run 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MATCH (n:Table) RETURN n LIMIT 25&lt;/code&gt; in the query box. You should see a few tables.&lt;/p&gt;

&lt;p&gt;If you have any issues, look at some of the &lt;a href=&quot;https://www.amundsen.io/amundsen/installation/#troubleshooting&quot;&gt;troubleshooting steps&lt;/a&gt;
in the Amundsen installation page.&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-week-can-i-add-a-udf-without-restarting-trino&quot;&gt;Question of the week: Can I add a UDF without restarting Trino?&lt;/h2&gt;

&lt;p&gt;This week’s question of the week comes in from Chen Xuying on the Trino Slack.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Is there any way to register &lt;a href=&quot;https://trino.io/docs/current/develop/functions.html&quot;&gt;a new user defined function (UDF)&lt;/a&gt; 
and needn’t restart coordinator and worker?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Currently, no. In Java, JAR files and all the Java code are loaded at start 
time, so in order to load the files on all the worker nodes and the coordinator, you
need to restart. There are various ways UDFs could be implemented in a dynamic
way, so we are still looking for suggestions here.&lt;/p&gt;

&lt;p&gt;One option, as Manfred mentions, would be to load JavaScript as a UDF, since Java
allows compiling JavaScript. This would allow new functions to be added 
without a restart. There may be other ways to achieve this, and we invite you to
contribute your ideas!&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://martinfowler.com/articles/data-mesh-principles.html&quot;&gt;Data Mesh Principles&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://blog.starburst.io/trino-data-governance-and-accelerating-data-science&quot;&gt;Trino, Data Governance, and Accelerating Data Science&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests Mark Grover, Co-creator of Amundsen and Founder at Stemma (@mark_grover). Release 362</summary>

      
      
    </entry>
  
    <entry>
      <title>25: Trino going through changes</title>
      <link href="https://trino.io/episodes/25.html" rel="alternate" type="text/html" title="25: Trino going through changes" />
      <published>2021-09-02T00:00:00+00:00</published>
      <updated>2021-09-02T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/25</id>
      <content type="html" xml:base="https://trino.io/episodes/25.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Ayush Chauhan, Data Platform Engineer at &lt;a href=&quot;https://www.zomato.com/who-we-are&quot;&gt;Zomato&lt;/a&gt;
 (&lt;a href=&quot;https://www.linkedin.com/in/ayush-chauhan/&quot;&gt;Ayush Chauhan&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Gunnar Morling, Lead of Debezium and Open source software engineer at &lt;a href=&quot;https://www.redhat.com&quot;&gt;Red Hat&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/gunnarmorling&quot;&gt;@gunnarmorling&lt;/a&gt;).&lt;/li&gt;
  &lt;li&gt;Ashhar Hasan, Software Engineer at &lt;a href=&quot;https://starburst.io/&quot;&gt;Starburst&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/hashhar&quot;&gt;@hashhar&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;release-361&quot;&gt;Release 361&lt;/h2&gt;

&lt;p&gt;Official announcement items from Martin:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for OAuth2/OIDC opaque access tokens&lt;/li&gt;
  &lt;li&gt;Aggregation pushdown for Pinot&lt;/li&gt;
  &lt;li&gt;Better performance for Parquet files with column indexes&lt;/li&gt;
  &lt;li&gt;Support for reading fields as JSON values in Elasticsearch&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred’s additional notes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Predicate pushdown in Cassandra&lt;/li&gt;
  &lt;li&gt;Metadata cache size limitation in a few connectors&lt;/li&gt;
  &lt;li&gt;Lots of improvements for Hive view support&lt;/li&gt;
  &lt;li&gt;Glue table statistics improvements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More info at &lt;a href=&quot;https://trino.io/docs/current/release/release-361.html&quot;&gt;https://trino.io/docs/current/release/release-361.html&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-week-change-data-capture&quot;&gt;Concept of the week: Change Data Capture&lt;/h2&gt;

&lt;p&gt;If you know Trino, you know it allows for flexible architectures that include 
many systems with varying use cases they support. We’ve come to accept this 
potpourri of systems as a general modus operandi for most businesses.&lt;/p&gt;

&lt;p&gt;Many times the data gets copied to different systems to accomplish varying use 
cases from performance and data warehousing to merge cross cutting data into a 
single store. When copying data between systems, how do these systems stay in 
sync? It’s a critical need especially for Trino to know that the state across 
the data sources we query is valid.&lt;/p&gt;

&lt;p&gt;To answer this, we can use the concept of Change Data Capture (CDC). CDC is a 
powerful concept that considers one or more data sources, called systems of record, 
that store the true state of a system. The systems of record are monitored for
changes, and upon detecting changes, the CDC system propagates the changes to a 
number of target systems.&lt;/p&gt;
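&lt;p&gt;The propagation loop can be sketched in a few lines of Python. This is a toy
polling model with made-up record shapes; real CDC tools like Debezium read the
database’s transaction log rather than polling a list of changes:&lt;/p&gt;

```python
# Toy change-data-capture loop: read each change from a system of
# record's change log and apply it to every registered target system.

changelog = [
    {"op": "insert", "id": 1, "value": "a"},
    {"op": "insert", "id": 2, "value": "b"},
    {"op": "update", "id": 1, "value": "c"},
    {"op": "delete", "id": 2},
]

def apply_change(store, change):
    if change["op"] == "delete":
        store.pop(change["id"], None)
    else:  # insert or update both upsert the value
        store[change["id"]] = change["value"]

targets = [{}, {}]  # e.g. a warehouse table and a search index
for change in changelog:
    for target in targets:
        apply_change(target, change)
```

&lt;p&gt;After replaying the log, every target holds the same state as the system of
record, which is exactly the property Trino relies on when querying across
those data sources.&lt;/p&gt;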

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/25/cdc.png&quot; /&gt;&lt;br /&gt;
Change Data Capture: &lt;a href=&quot;https://medium.com/event-driven-utopia/a-gentle-introduction-to-event-driven-change-data-capture-683297625f9b&quot;&gt;Source&lt;/a&gt;.
&lt;/p&gt;

&lt;h3 id=&quot;debezium-for-cdc&quot;&gt;Debezium for CDC&lt;/h3&gt;

&lt;p&gt;One implementation of CDC that has grown tremendously in popularity since its 
inception is called Debezium. According to &lt;a href=&quot;https://debezium.io&quot;&gt;https://debezium.io&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Debezium is an open-source distributed platform for change data capture. Start
it up, point it at your databases, and your apps can start responding to all 
of the inserts, updates, and deletes that other apps commit to your databases.
Debezium is durable and fast, so your apps can respond quickly and never miss
an event, even when things go wrong.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The common way Debezium is deployed in the wild is using &lt;a href=&quot;https://docs.confluent.io/platform/current/connect/index.html&quot;&gt;Kafka Connect&lt;/a&gt; 
and defining the Debezium source connectors. You can then use the Kafka Connect 
ecosystem to write to different targets downstream.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/25/debezium-architecture.png&quot; /&gt;&lt;br /&gt;
The Debezium architecture with Kafka Connect: &lt;a href=&quot;https://debezium.io/documentation/reference/architecture.html&quot;&gt;Source&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;Another alternative, if you don’t want to use Kafka, is to use dedicated Debezium
servers to implement CDC and push the logs to the target database downstream 
using Debezium connectors.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/25/debezium-server-architecture.png&quot; /&gt;&lt;br /&gt;
The Debezium standalone server architecture: &lt;a href=&quot;https://debezium.io/documentation/reference/architecture.html&quot;&gt;Source&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;While CDC is the primary focus, Debezium also provides support for more advanced
concepts such as the &lt;a href=&quot;https://debezium.io/documentation/reference/integrations/outbox.html&quot;&gt;outbox pattern support for Quarkus apps&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;debezium--trino-at-zomato&quot;&gt;Debezium + Trino at Zomato&lt;/h3&gt;

&lt;p&gt;Zomato is a technology platform that connects customers, restaurant partners and
delivery partners, serving their multiple needs. Customers use their platform to
search and discover restaurants, read and write customer generated reviews and 
view and upload photos, order food delivery, book a table and make payments 
while dining out at restaurants. Clearly, there’s a lot of data that can flow
through a platform like this. You have operational databases to support
the applications in this platform, but also need big data stores to store and
analyze all of this data.&lt;/p&gt;

&lt;p&gt;Here is one of the earlier iterations of Zomato’s big data architecture before
they were able to integrate Debezium. Ayush covers some of the pain points they
experienced before implementing CDC.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/25/zomato-before.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;Once Zomato implemented CDC, they were able to keep their downstream Iceberg 
stores in sync across multiple operational systems. As a result the analytics 
data is now much more dependable.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/25/zomato-after.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-4140-implement-aggregation-pushdown-in-pinot&quot;&gt;PR of the week: PR 4140 Implement aggregation pushdown in Pinot&lt;/h2&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/trinodb/trino/pull/6069&quot;&gt;PR of the week&lt;/a&gt; is actually a
throwback to &lt;a href=&quot;/episodes/13.html&quot;&gt;episode thirteen&lt;/a&gt;, &lt;em&gt;Trino takes a sip of Pinot&lt;/em&gt;,
where our guest &lt;a href=&quot;https://twitter.com/ElonAzoulay&quot;&gt;Elon Azoulay&lt;/a&gt; discussed some of
the features coming to the Pinot connector. Aggregation pushdown
was on that list, and it just landed in the 361 release!&lt;/p&gt;

&lt;p&gt;This PR implements aggregation pushdown for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;COUNT&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AVG&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MIN&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MAX&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SUM&lt;/code&gt;,
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;COUNT(DISTINCT)&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;approx_distinct&lt;/code&gt;. It is enabled by default and can be 
disabled using the configuration property &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pinot.aggregation-pushdown.enabled&lt;/code&gt; 
or the catalog session property &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;aggregation_pushdown_enabled&lt;/code&gt;.&lt;/p&gt;
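&lt;p&gt;For example, to turn the pushdown off you can set the configuration property
in the catalog’s properties file, or set the session property per query. The
file path and the catalog name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pinot&lt;/code&gt; here are assumptions for
illustration:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# etc/catalog/pinot.properties
pinot.aggregation-pushdown.enabled=false

# or, for the current session only:
SET SESSION pinot.aggregation_pushdown_enabled = false;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;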

&lt;p&gt;FYI: &lt;a href=&quot;https://github.com/trinodb/trino/pull/9208&quot;&gt;https://github.com/trinodb/trino/pull/9208&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thanks Elon!&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-week-is-there-an-array-function-that-flattens-a-row-like-1--a-b-c-into-three-rows&quot;&gt;Question of the week: Is there an array function that flattens a row like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1 | [a, b, c]&lt;/code&gt; into three rows?&lt;/h2&gt;

&lt;p&gt;Our &lt;a href=&quot;https://trinodb.slack.com/archives/CFLB9AMBN/p1630241736052500&quot;&gt;question of the week&lt;/a&gt;
comes from Brian Hudson on our Trino community Slack. Brian is dealing with an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ARRAY&lt;/code&gt;
type in one column and an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INTEGER&lt;/code&gt; in another. This is common when 
processing nested denormalized data. The goal is to take this row, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1 | [a, b, c]&lt;/code&gt;,
and split the array into three rows:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;1 | a
1 | b
1 | c
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Kasia answered this question by using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UNNEST&lt;/code&gt; on the array column. The
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UNNEST&lt;/code&gt; statement produces a single column with one row per array element, and a 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;JOIN&lt;/code&gt; is performed with the original &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INTEGER&lt;/code&gt; column.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
WITH t(x, y) AS (VALUES (1, ARRAY[&apos;a&apos;, &apos;b&apos;, &apos;c&apos;]))
SELECT x, y_unnested
FROM t
LEFT JOIN UNNEST (t.y) t2(y_unnested) ON true;

trino&amp;gt; WITH t(x, y) AS (VALUES (1, ARRAY[&apos;a&apos;, &apos;b&apos;, &apos;c&apos;]))
     -&amp;gt; SELECT x, y_unnested
     -&amp;gt; FROM t
     -&amp;gt; LEFT JOIN UNNEST (t.y) t2(y_unnested) ON true;
 x | y_unnested
---+------------
 1 | a
 1 | b
 1 | c
(3 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs and Resources&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/event-driven-utopia/a-gentle-introduction-to-event-driven-change-data-capture-683297625f9b&quot;&gt;A gentle introduction to Event Driven Change Data Capture&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/event-driven-utopia/a-visual-introduction-to-debezium-32563e23c6b8&quot;&gt;A Visual Introduction to Debezium&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://debezium.io/blog/&quot;&gt;Debezium Blog&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://debezium.io/documentation/reference/&quot;&gt;Debezium Docs&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/debezium/debezium-examples/&quot;&gt;Debezium Examples&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://debezium.io/documentation/online-resources/&quot;&gt;Debezium Resources&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Videos&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.infoq.com/presentations/data-streaming-kafka-debezium/&quot;&gt;Practical Change Data Streaming Use Cases with Apache Kafka &amp;amp; Debezium&lt;/a&gt;
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://speakerdeck.com/gunnarmorling/practical-change-data-streaming-use-cases-with-apache-kafka-and-debezium-qcon-san-francisco-2019&quot;&gt;Slides&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=QYbXDp4Vu-8&quot;&gt;Apache Kafka and Debezium / DevNation Tech Talk&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests</summary>

      
      
    </entry>
  
    <entry>
      <title>24: Trinetes I: Trino on Kubernetes</title>
      <link href="https://trino.io/episodes/24.html" rel="alternate" type="text/html" title="24: Trinetes I: Trino on Kubernetes" />
      <published>2021-08-19T00:00:00+00:00</published>
      <updated>2021-08-19T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/24</id>
      <content type="html" xml:base="https://trino.io/episodes/24.html">&lt;p&gt;This is the first episode in a series where we cover the basics and just enough
advanced Kubernetes features and information to understand how to deploy Trino 
on Kubernetes.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-week-k8s-architecture-containers-pods-and-kubelets&quot;&gt;Concept of the week: K8s architecture: Containers, Pods, and kubelets&lt;/h2&gt;

&lt;p&gt;For this concept of the week, we want to provide you a minimalistic overview of
what you need to know about Kubernetes to deploy Trino to a cluster.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Why Kubernetes?&lt;/strong&gt; Kubernetes is a container orchestration platform that allows
you to declare how containers are managed using YAML configuration files. This
definition can be tricky to understand without the proper context. To make sure
nobody is left behind, it is useful to cover what containers are:&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;
        &lt;p&gt;The traditional way to deploy an application is to take the compiled 
binary of that application and run it directly on computer hardware that has
an operating system. This works, but it depends heavily on the underlying
hardware and operating system and requires multiple applications to share the
same resources. If one application fails and causes any of the shared
resources to crash, it can bring down every application on that machine.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;To remove these dependencies, engineers created virtual machines (VMs), 
using a VM manager called a hypervisor that emulates hardware environments 
to host other operating systems. This is a big step forward, as each 
application can now be isolated, but it comes at a great cost. Each virtual
machine hosts an entire operating system, making it resource intensive and slow.&lt;/p&gt;
      &lt;/li&gt;
      &lt;li&gt;
        &lt;p&gt;Containers are the newest type of deployment. Containers enable a logical
isolation of resources while still physically running on shared resources. 
All resources created in the hardware and operating systems exist on the host
system. The isolation restricts any interference from other processes. 
Containers achieve the goals of virtualization without sacrificing much 
performance or efficiency.&lt;/p&gt;
      &lt;/li&gt;
    &lt;/ul&gt;

    &lt;p align=&quot;center&quot;&gt;
 &lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/24/container-evolution.svg&quot; /&gt;&lt;br /&gt;
 Source: https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/
&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;Containerization simplified a trend in service oriented architecture called 
microservices. Microservices deploy loosely coupled and modular applications
rather than all-encompassing monolithic applications. With containers, these
applications can be deployed and scaled up quickly across various virtual and
physical machines without affecting other applications on the same machine. 
This is great, but results in new complexities. Some examples are the need 
for new approaches to monitoring the health of applications, scaling the 
applications as requests grow and diminish, redeploying crashed applications, 
and networking the applications together. In summary, all of these activities
can be considered container orchestration and this is exactly what Kubernetes
solves!&lt;/li&gt;
    &lt;/ul&gt;

    &lt;p align=&quot;center&quot;&gt;
 &lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/24/load-balancer.jpeg&quot; /&gt;&lt;br /&gt;
 Source: https://www.slideshare.net/devopsdaysaustin/continuously-delivering-microservices-in-kubernetes-using-jenkins&lt;br /&gt;
 Here we have two services that each sit behind a load balancer provided and mapped by the Kubernetes cluster.
&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Kubernetes components and architecture&lt;/strong&gt;:&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;Node - The physical machine or VM running a kubelet and container runtime.&lt;/li&gt;
      &lt;li&gt;Control Plane - The container orchestration layer that exposes the API and 
interfaces to define, deploy, and manage the lifecycle of containers.&lt;/li&gt;
      &lt;li&gt;Cluster - A set of nodes connected to the same control plane.&lt;/li&gt;
      &lt;li&gt;Pod - A single instance of an application; the smallest object in Kubernetes.&lt;/li&gt;
    &lt;/ul&gt;

    &lt;p align=&quot;center&quot;&gt;
 &lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/24/components-of-kubernetes.svg&quot; /&gt;&lt;br /&gt;
 Source: https://kubernetes.io/docs/concepts/overview/components/
&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;kubernetes-control-plane-components&quot;&gt;Kubernetes control plane components:&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;API server that nodes connect to, and the front end for users and 
 administrators of the cluster.&lt;/li&gt;
  &lt;li&gt;etcd, a distributed key-value store containing all data used to manage 
 the cluster.&lt;/li&gt;
  &lt;li&gt;Scheduler that distributes work across nodes and assigns newly created 
 containers to nodes.&lt;/li&gt;
  &lt;li&gt;Controllers that are the brains behind orchestration, monitoring for 
 nodes going down, and more.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;kubernetes-worker-node-components&quot;&gt;Kubernetes worker node components:&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;container runtime - underlying runtime used to manage containers&lt;/li&gt;
  &lt;li&gt;kubelet - agent that checks the health and manages the pods running on the node based on the desired state provided in the PodSpec&lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;kube-proxy - network proxy that maintains network rules applied to nodes and allows network access between Pods in a cluster&lt;/p&gt;

    &lt;p&gt;You can scale up multiple pods on a single node until the node has no more 
resources, at which time a new node needs to be added and pod instances are 
distributed between the nodes.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
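
&lt;p&gt;As a sketch of how this works in practice (assuming a hypothetical worker
deployment named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tcb-trino-worker&lt;/code&gt;), you declare a new desired
replica count and let the scheduler place the pods across the available nodes:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Scale the (hypothetical) worker deployment to three pods.
kubectl scale deployment tcb-trino-worker --replicas=3

# Watch the scheduler distribute the pods across the nodes.
kubectl get pods -o wide --watch
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;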

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;So how does this relate to Trino?&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
  &lt;li&gt;Out of the box, Kubernetes can do these key things for Trino.
    &lt;ul&gt;
      &lt;li&gt;Simple scale up and down (manually tell k8s to start or kill Trino pods).&lt;/li&gt;
      &lt;li&gt;Kubernetes supports failover, meaning that your workers will restart if they die.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Advanced features that could exist, but are not currently available in open source.
    &lt;ul&gt;
      &lt;li&gt;Auto-scaling via the &lt;a href=&quot;https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/&quot;&gt;Horizontal Pod Autoscaler&lt;/a&gt; 
 and custom metrics.&lt;/li&gt;
      &lt;li&gt;Graceful shutdown hooks that you can add to your cluster, so that 
 workers wait before shutting down and callers avoid failed calls to a node 
 that has already shut down.&lt;/li&gt;
    &lt;/ul&gt;
    &lt;p align=&quot;center&quot;&gt;
     &lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/24/kubernetes-shutdown.svg&quot; /&gt;&lt;br /&gt;
     Source: https://learnk8s.io/graceful-shutdown
  &lt;/p&gt;
    &lt;p align=&quot;center&quot;&gt;
     &lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/24/graceful-shutdown.svg&quot; /&gt;&lt;br /&gt;
     Source: https://learnk8s.io/graceful-shutdown
  &lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
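
&lt;p&gt;As an illustrative sketch only (this is not an existing open source feature),
autoscaling Trino workers via the Horizontal Pod Autoscaler could start from a
manifest like the following, assuming a hypothetical
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tcb-trino-worker&lt;/code&gt; deployment; custom metrics such as queued
queries would still need to be wired up separately:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: tcb-trino-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: tcb-trino-worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;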

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;What the heck are helm charts then?&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
  &lt;li&gt;Helm is a package manager for Kubernetes.&lt;/li&gt;
  &lt;li&gt;It removes the need to manage lots of Kubernetes-related YAML files.&lt;/li&gt;
  &lt;li&gt;It is the best way to deploy apps to Kubernetes.&lt;/li&gt;
  &lt;li&gt;Charts are available for many different applications.&lt;/li&gt;
  &lt;li&gt;A Helm chart exists for Trino.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-11-merge-contributor-version-of-k8s-charts-with-the-community-version&quot;&gt;PR of the week: PR 11 Merge contributor version of k8s charts with the community version&lt;/h2&gt;

&lt;p&gt;This week’s &lt;a href=&quot;https://github.com/trinodb/charts/pull/11&quot;&gt;PR of the week&lt;/a&gt; comes 
from a different repository under the trinodb org, &lt;a href=&quot;https://github.com/trinodb/charts&quot;&gt;trinodb/charts&lt;/a&gt;.
This PR merges contributions from &lt;a href=&quot;https://github.com/valeriano-manassero&quot;&gt;Valeriano Manassero&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Valeriano maintains a &lt;a href=&quot;https://github.com/valeriano-manassero/helm-charts/tree/main/valeriano-manassero/trino&quot;&gt;very useful Helm chart&lt;/a&gt; 
that predates the Trino org defining its own community chart. This pull 
request merges some of the useful features Valeriano added to his Trino Helm 
chart, so they can be maintained in the community version.&lt;/p&gt;

&lt;p&gt;Valeriano’s Trino Helm Chart: &lt;a href=&quot;https://artifacthub.io/packages/helm/valeriano-manassero/trino&quot;&gt;https://artifacthub.io/packages/helm/valeriano-manassero/trino&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It hasn’t been merged yet, but we are really looking forward to seeing it 
merged in. Thanks Valeriano!&lt;/p&gt;

&lt;h2 id=&quot;demo-running-the-trino-charts-with-kubectl&quot;&gt;Demo: Running the Trino charts with kubectl&lt;/h2&gt;

&lt;p&gt;For this week’s demo, you need to install &lt;a href=&quot;https://kubernetes.io/docs/tasks/tools/&quot;&gt;kubectl&lt;/a&gt;,
&lt;a href=&quot;https://minikube.sigs.k8s.io/docs/start/&quot;&gt;minikube&lt;/a&gt; using the &lt;a href=&quot;https://minikube.sigs.k8s.io/docs/drivers/docker/&quot;&gt;docker driver&lt;/a&gt;,
and &lt;a href=&quot;https://helm.sh/docs/intro/install/&quot;&gt;helm&lt;/a&gt;. You can find the Trino Helm 
chart on ArtifactHub at this URL.&lt;/p&gt;

&lt;p&gt;https://artifacthub.io/packages/helm/trino/trino&lt;/p&gt;

&lt;p&gt;First, start your minikube instance.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;minikube start --driver=docker
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now take a quick look at the state of your k8s cluster.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get all
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Add the template for the different trino catalogs on coordinators and workers.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
# Source: trino/templates/configmap-catalog.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcb-trino-catalog
  labels:
    app: trino
    chart: trino-0.2.0
    release: tcb
    heritage: Helm
    role: catalogs
data:
  tpch.properties: |
    connector.name=tpch
    tpch.splits-per-node=4
  tpcds.properties: |
    connector.name=tpcds
    tpcds.splits-per-node=4
EOF
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Add the template for a single coordinator configuration.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
# Source: trino/templates/configmap-coordinator.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcb-trino-coordinator
  labels:
    app: trino
    chart: trino-0.2.0
    release: tcb
    heritage: Helm
    component: coordinator
data:
  node.properties: |
    node.environment=production
    node.data-dir=/data/trino
    plugin.dir=/usr/lib/trino/plugin

  jvm.config: |
    -server
    -Xmx8G
    -XX:+UseG1GC
    -XX:G1HeapRegionSize=32M
    -XX:+UseGCOverheadLimit
    -XX:+ExplicitGCInvokesConcurrent
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:+ExitOnOutOfMemoryError
    -Djdk.attach.allowAttachSelf=true
    -XX:-UseBiasedLocking
    -XX:ReservedCodeCacheSize=512M
    -XX:PerMethodRecompilationCutoff=10000
    -XX:PerBytecodeRecompilationCutoff=10000
    -Djdk.nio.maxCachedBufferSize=2000000

  config.properties: |
    coordinator=true
    node-scheduler.include-coordinator=true
    http-server.http.port=8080
    query.max-memory=4GB
    query.max-memory-per-node=1GB
    query.max-total-memory-per-node=2GB
    memory.heap-headroom-per-node=1GB
    discovery-server.enabled=true
    discovery.uri=http://localhost:8080

  log.properties: |
    io.trino=INFO
EOF
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Add the tcb-trino service definition to run Trino.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
# Source: trino/templates/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: tcb-trino
  labels:
    app: trino
    chart: trino-0.2.0
    release: tcb
    heritage: Helm
spec:
  type: ClusterIP
  ports:
    - port: 8080
      targetPort: http
      protocol: TCP
      name: http
  selector:
    app: trino
    release: tcb
    component: coordinator
EOF
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Add the deployment definition for the service.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
# Source: trino/templates/deployment-coordinator.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tcb-trino-coordinator
  labels:
    app: trino
    chart: trino-0.2.0
    release: tcb
    heritage: Helm
    component: coordinator
spec:
  selector:
    matchLabels:
      app: trino
      release: tcb
      component: coordinator
  template:
    metadata:
      labels:
        app: trino
        release: tcb
        component: coordinator
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
      volumes:
        - name: config-volume
          configMap:
            name: tcb-trino-coordinator
        - name: catalog-volume
          configMap:
            name: tcb-trino-catalog
      imagePullSecrets:
        - name: registry-credentials
      containers:
        - name: trino-coordinator
          image: &quot;trinodb/trino:latest&quot;
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /etc/trino
              name: config-volume
            - mountPath: /etc/trino/catalog
              name: catalog-volume
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /v1/info
              port: http
          readinessProbe:
            httpGet:
              path: /v1/info
              port: http
          resources:
            {}
EOF
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now check the state of the k8s cluster again.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl get all
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Run the following command to expose the URL and port of the service on the local system.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;minikube service tcb-trino --url
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Clean up all the resources.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;kubectl delete pod --all
kubectl delete replicaset --all
kubectl delete service tcb-trino
kubectl delete deployment tcb-trino-coordinator
kubectl delete configmap --all
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now you can run the same demo using the helm chart which includes all of these
templates out-of-the-box. First add the trino helm chart, check the templates
that are produced by helm, and run the install.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# HELM DEMO

helm repo add trino https://trinodb.github.io/charts/

helm template tcb trino/trino --version 0.2.0

helm install tcb trino/trino --version 0.2.0
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now that it’s installed, run the same command to expose the URL of the service.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;minikube service tcb-trino --url
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Clean up all the resources.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;minikube delete
helm repo remove trino
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Trino Summit is moving to 100% virtual: &lt;a href=&quot;https://www.starburst.io/info/trinosummit/&quot;&gt;register here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>This is the first episode in a series where we cover the basics and just enough advanced Kubernetes features and information to understand how to deploy Trino on Kubernetes.</summary>

      
      
    </entry>
  
    <entry>
      <title>23: Trino looking for patterns</title>
      <link href="https://trino.io/episodes/23.html" rel="alternate" type="text/html" title="23: Trino looking for patterns" />
      <published>2021-08-02T00:00:00+00:00</published>
      <updated>2021-08-02T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/23</id>
      <content type="html" xml:base="https://trino.io/episodes/23.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;Kasia Findeisen, Software Engineer at &lt;a href=&quot;https://starburst.io/&quot;&gt;Starburst&lt;/a&gt;
 (&lt;a href=&quot;https://github.com/kasiafi&quot;&gt;@kasiafi&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;release-360&quot;&gt;Release 360&lt;/h2&gt;

&lt;p&gt;In our last episode we already had a glimpse of this release. Now it is really out.&lt;/p&gt;

&lt;p&gt;Official announcement items from Martin:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Automatic configuration of TLS for internal communication.&lt;/li&gt;
  &lt;li&gt;Improved correlated subqueries with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GROUP BY&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LIMIT&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Support for assuming an IAM role in Elasticsearch connector.&lt;/li&gt;
  &lt;li&gt;Support for Trino views in Iceberg connector.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred’s additional notes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Documentation for materialized views SQL commands&lt;/li&gt;
  &lt;li&gt;Partial support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DELETE&lt;/code&gt; and batch insert support for various JDBC-based connectors&lt;/li&gt;
  &lt;li&gt;A bunch of performance and correctness fixes&lt;/li&gt;
  &lt;li&gt;Numerous improvements on Iceberg connector&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More info at &lt;a href=&quot;https://trino.io/docs/current/release/release-360.html&quot;&gt;https://trino.io/docs/current/release/release-360.html&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-week-row-pattern-matching-and-match_recognize&quot;&gt;Concept of the week: Row pattern matching and MATCH_RECOGNIZE&lt;/h2&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MATCH_RECOGNIZE&lt;/code&gt; syntax was introduced in the SQL:2016
specification. It is a super powerful tool for analyzing trends in your data. We are
proud to announce that Trino supports this great feature since
&lt;a href=&quot;https://trino.io/docs/current/release/release-356.html&quot;&gt;version 356&lt;/a&gt;. With
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MATCH_RECOGNIZE&lt;/code&gt;, you can define a pattern using the well-known regular
expression syntax, and match it to a set of rows. Upon finding a matching row
sequence, you can retrieve all kinds of detailed or summary information about
the match, and pass it on to be processed by the subsequent parts of your
query. This is a new level of what a pure SQL statement can do.&lt;/p&gt;

&lt;p&gt;For more details, &lt;a href=&quot;/blog/2021/05/19/row_pattern_matching.html&quot;&gt;this blog post&lt;/a&gt; 
gives you a taste of row pattern matching capabilities, and a quick overview of 
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MATCH_RECOGNIZE&lt;/code&gt; syntax.&lt;/p&gt;

&lt;p&gt;Let’s look at an example with data similar to the TPC-H data, with the same 
goal as the blog post: detect a “V”-shape of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;price&lt;/code&gt;
values over time for different customers.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;trino&amp;gt; WITH orders(customer_id, order_date, price) AS (VALUES
    (&apos;cust_1&apos;, DATE &apos;2020-05-11&apos;, 100),
    (&apos;cust_1&apos;, DATE &apos;2020-05-12&apos;, 200),
    (&apos;cust_2&apos;, DATE &apos;2020-05-13&apos;,   8),
    (&apos;cust_1&apos;, DATE &apos;2020-05-14&apos;, 100),
    (&apos;cust_2&apos;, DATE &apos;2020-05-15&apos;,   4),
    (&apos;cust_1&apos;, DATE &apos;2020-05-16&apos;,  50),
    (&apos;cust_1&apos;, DATE &apos;2020-05-17&apos;, 100),
    (&apos;cust_2&apos;, DATE &apos;2020-05-18&apos;,   6))
SELECT customer_id, start_price, bottom_price, final_price, start_date, final_date
    FROM orders
        MATCH_RECOGNIZE (
            PARTITION BY customer_id
            ORDER BY order_date
            MEASURES
                START.price AS start_price,
                LAST(DOWN.price) AS bottom_price,
                LAST(UP.price) AS final_price,
                START.order_date AS start_date,
                LAST(UP.order_date) AS final_date
            ONE ROW PER MATCH
            AFTER MATCH SKIP PAST LAST ROW
            PATTERN (START DOWN+ UP+)
            DEFINE
                DOWN AS price &amp;lt; PREV(price),
                UP AS price &amp;gt; PREV(price)
            );

 customer_id | start_price | bottom_price | final_price | start_date | final_date
-------------+-------------+--------------+-------------+------------+------------
 cust_1      |         200 |           50 |         100 | 2020-05-12 | 2020-05-17
 cust_2      |           8 |            4 |           6 | 2020-05-13 | 2020-05-18
(2 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Two matches are detected, one for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cust_1&lt;/code&gt;, and one for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cust_2&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The matching algorithm was a collaboration between Martin and Kasia. This 
algorithm &lt;a href=&quot;https://github.com/trinodb/trino/blob/master/core/trino-main/src/main/java/io/trino/operator/window/matcher/Matcher.java&quot;&gt;lives in the Matcher class&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;running semantics&lt;/em&gt; is the default in both the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DEFINE&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MEASURES&lt;/code&gt;
clauses. Note that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FINAL&lt;/code&gt; only applies to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MEASURES&lt;/code&gt; clause.&lt;/p&gt;
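
&lt;p&gt;To illustrate the difference, here is a sketch (not from the episode) that
reuses the orders data from the example above. With &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALL ROWS PER MATCH&lt;/code&gt;, a
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RUNNING&lt;/code&gt; measure reflects only the rows matched so far, while the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FINAL&lt;/code&gt; version reports the value for the completed match on every output row:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT customer_id, order_date, price, running_up, final_up
    FROM orders
        MATCH_RECOGNIZE (
            PARTITION BY customer_id
            ORDER BY order_date
            MEASURES
                RUNNING LAST(UP.price) AS running_up,
                FINAL LAST(UP.price) AS final_up
            ALL ROWS PER MATCH
            AFTER MATCH SKIP PAST LAST ROW
            PATTERN (START DOWN+ UP+)
            DEFINE
                DOWN AS price &amp;lt; PREV(price),
                UP AS price &amp;gt; PREV(price)
            );
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;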

&lt;p&gt;To sum up, here’s one complex measure expression combining different elements
of the special syntax:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/blog/match-recognize/measure-example.svg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-8348-document-row-pattern-recognition-in-window&quot;&gt;PR of the week: PR 8348 Document row pattern recognition in window&lt;/h2&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/trinodb/trino/pull/8348&quot;&gt;PR of the week&lt;/a&gt; adds 
documentation for applying pattern matching over windows. This is yet another
piece of SQL functionality that Kasia added after getting pattern recognition to work
with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MATCH_RECOGNIZE&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id=&quot;demo-showing-match_recognize-functionality-by-example&quot;&gt;Demo: Showing MATCH_RECOGNIZE functionality by example&lt;/h2&gt;

&lt;p&gt;Here are a few examples that Kasia will be running in the demo:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The initial query. That’s mostly the same query that’s in the blog post, the 
differences being:
    &lt;ul&gt;
      &lt;li&gt;Usage of a real table instead of a CTE.&lt;/li&gt;
      &lt;li&gt;Additional sort key for consistent ordering.&lt;/li&gt;
      &lt;li&gt;Two more measures.&lt;/li&gt;
    &lt;/ul&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; SELECT custkey, match_no, start_price, bottom_price, final_price, start_date, final_date, classy
               FROM orders
                   MATCH_RECOGNIZE (
                       PARTITION BY custkey
                       ORDER BY orderdate, orderkey
                       MEASURES
                           START.totalprice AS start_price,
                           LAST(DOWN.totalprice) AS bottom_price,
                           LAST(UP.totalprice) AS final_price,
                           START.orderdate AS start_date,
                           LAST(UP.orderdate) AS final_date,
                           MATCH_NUMBER() AS match_no,
                           CLASSIFIER() AS classy
                       ONE ROW PER MATCH
                       AFTER MATCH SKIP PAST LAST ROW
                       PATTERN (START DOWN+ UP+)
                       DEFINE
                           DOWN AS totalprice &amp;lt; PREV(totalprice),
                           UP AS totalprice &amp;gt; PREV(totalprice)
                       )
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The query returns many results (many matches). Wrap it in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;count(*)&lt;/code&gt; 
aggregation to check how many there are:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; SELECT count(*) FROM (SELECT custkey, match_no, start_price, bottom_price, final_price, start_date, final_date, classy
               FROM orders
                   MATCH_RECOGNIZE (
                       PARTITION BY custkey
                       ORDER BY orderdate, orderkey
                       MEASURES
                           START.totalprice AS start_price,
                           LAST(DOWN.totalprice) AS bottom_price,
                           LAST(UP.totalprice) AS final_price,
                           START.orderdate AS start_date,
                           LAST(UP.orderdate) AS final_date,
                           MATCH_NUMBER() AS match_no,
                           CLASSIFIER() AS classy
                       ONE ROW PER MATCH
                       AFTER MATCH SKIP PAST LAST ROW
                       PATTERN (START DOWN+ UP+)
                       DEFINE
                           DOWN AS totalprice &amp;lt; PREV(totalprice),
                           UP AS totalprice &amp;gt; PREV(totalprice)
                       ))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Modify the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;PATTERN&lt;/code&gt; to limit the results. Now searching for a “big V”:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; SELECT count(*) FROM (SELECT custkey, match_no, start_price, bottom_price, final_price, start_date, final_date, classy
               FROM orders
                   MATCH_RECOGNIZE (
                       PARTITION BY custkey
                       ORDER BY orderdate, orderkey
                       MEASURES
                           START.totalprice AS start_price,
                           LAST(DOWN.totalprice) AS bottom_price,
                           LAST(UP.totalprice) AS final_price,
                           START.orderdate AS start_date,
                           LAST(UP.orderdate) AS final_date,
                           MATCH_NUMBER() AS match_no,
                           CLASSIFIER() AS classy
                       ONE ROW PER MATCH
                       AFTER MATCH SKIP PAST LAST ROW
                       PATTERN (START DOWN{3,} UP{4,})
                       DEFINE
                           DOWN AS totalprice &amp;lt; PREV(totalprice),
                           UP AS totalprice &amp;gt; PREV(totalprice)
                       ))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Remove the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;count(*)&lt;/code&gt; aggregation to see the actual matches:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; SELECT custkey, match_no, start_price, bottom_price, final_price, start_date, final_date, classy
               FROM orders
                   MATCH_RECOGNIZE (
                       PARTITION BY custkey
                       ORDER BY orderdate, orderkey
                       MEASURES
                           START.totalprice AS start_price,
                           LAST(DOWN.totalprice) AS bottom_price,
                           LAST(UP.totalprice) AS final_price,
                           START.orderdate AS start_date,
                           LAST(UP.orderdate) AS final_date,
                           MATCH_NUMBER() AS match_no,
                           CLASSIFIER() AS classy
                       ONE ROW PER MATCH
                       AFTER MATCH SKIP PAST LAST ROW
                       PATTERN (START DOWN{3,} UP{4,})
                       DEFINE
                           DOWN AS totalprice &amp;lt; PREV(totalprice),
                           UP AS totalprice &amp;gt; PREV(totalprice)
                       )
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Change &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AFTER MATCH SKIP PAST LAST ROW&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AFTER MATCH SKIP TO NEXT ROW&lt;/code&gt; to 
detect overlapping matches:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; SELECT custkey, match_no, start_price, bottom_price, final_price, start_date, final_date, classy
               FROM orders
                   MATCH_RECOGNIZE (
                       PARTITION BY custkey
                       ORDER BY orderdate, orderkey
                       MEASURES
                           START.totalprice AS start_price,
                           LAST(DOWN.totalprice) AS bottom_price,
                           LAST(UP.totalprice) AS final_price,
                           START.orderdate AS start_date,
                           LAST(UP.orderdate) AS final_date,
                           MATCH_NUMBER() AS match_no,
                           CLASSIFIER() AS classy
                       ONE ROW PER MATCH
                       AFTER MATCH SKIP TO NEXT ROW
                       PATTERN (START DOWN{3,} UP{4,})
                       DEFINE
                           DOWN AS totalprice &amp;lt; PREV(totalprice),
                           UP AS totalprice &amp;gt; PREV(totalprice)
                       )
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Change &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ONE ROW PER MATCH&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ALL ROWS PER MATCH&lt;/code&gt; (and revert the previous 
change). Discuss the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;classy&lt;/code&gt; column and explain the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;running&lt;/code&gt; semantics using the 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;final_date&lt;/code&gt; column as an example:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; SELECT custkey, match_no, start_price, bottom_price, final_price, start_date, final_date, classy
               FROM orders
                   MATCH_RECOGNIZE (
                       PARTITION BY custkey
                       ORDER BY orderdate, orderkey
                       MEASURES
                           START.totalprice AS start_price,
                           LAST(DOWN.totalprice) AS bottom_price,
                           LAST(UP.totalprice) AS final_price,
                           START.orderdate AS start_date,
                           LAST(UP.orderdate) AS final_date,
                           MATCH_NUMBER() AS match_no,
                           CLASSIFIER() AS classy
                       ALL ROWS PER MATCH
                       AFTER MATCH SKIP PAST LAST ROW
                       PATTERN (START DOWN{3,} UP{4,})
                       DEFINE
                           DOWN AS totalprice &amp;lt; PREV(totalprice),
                           UP AS totalprice &amp;gt; PREV(totalprice)
                       )
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Change the semantics of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;final_date&lt;/code&gt; column to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FINAL&lt;/code&gt;:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; SELECT custkey, match_no, start_price, bottom_price, final_price, start_date, final_date, classy
               FROM orders
                   MATCH_RECOGNIZE (
                       PARTITION BY custkey
                       ORDER BY orderdate, orderkey
                       MEASURES
                           START.totalprice AS start_price,
                           LAST(DOWN.totalprice) AS bottom_price,
                           LAST(UP.totalprice) AS final_price,
                           START.orderdate AS start_date,
                           FINAL LAST(UP.orderdate) AS final_date,
                           MATCH_NUMBER() AS match_no,
                           CLASSIFIER() AS classy
                       ALL ROWS PER MATCH
                       AFTER MATCH SKIP PAST LAST ROW
                       PATTERN (START DOWN{3,} UP{4,})
                       DEFINE
                           DOWN AS totalprice &amp;lt; PREV(totalprice),
                           UP AS totalprice &amp;gt; PREV(totalprice)
                       )
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
&lt;/ol&gt;
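For intuition about what the `PATTERN (START DOWN+ UP+)` queries above are doing, here is a small Python sketch (our own construction, not from the episode) that finds the same greedy, non-overlapping "V" shapes in a list of prices, mimicking `AFTER MATCH SKIP PAST LAST ROW`:

```python
# Hypothetical Python analogue of PATTERN (START DOWN+ UP+) with
# AFTER MATCH SKIP PAST LAST ROW: find non-overlapping "V" shapes in a
# price series (a start row, one or more strictly falling rows, then one
# or more strictly rising rows). For intuition only, not Trino's engine.

def find_v_matches(prices):
    """Return (start, bottom, end) index triples for each greedy V match."""
    matches = []
    i = 0
    n = len(prices)
    while i < n - 2:
        j = i + 1
        # DOWN+: consume strictly falling rows
        while j < n and prices[j] < prices[j - 1]:
            j += 1
        if j == i + 1:          # no DOWN row: no match starting at i
            i += 1
            continue
        bottom = j - 1
        k = j
        # UP+: consume strictly rising rows
        while k < n and prices[k] > prices[k - 1]:
            k += 1
        if k == j:              # no UP row: no match starting at i
            i += 1
            continue
        matches.append((i, bottom, k - 1))
        i = k                   # SKIP PAST LAST ROW: resume after the match
    return matches

prices = [10, 8, 6, 7, 9, 5, 4, 6]
print(find_v_matches(prices))   # two V shapes: indexes (0, 2, 4) and (5, 6, 7)
```

As in the demo queries, making the quantifiers stricter (for example requiring at least 3 falling and 4 rising rows, like `DOWN{3,} UP{4,}`) would simply add minimum-length checks on the two inner loops.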

&lt;h2 id=&quot;question-of-the-week-how-do-you-tag-a-list-of-rows-with-custom-periodic-rules&quot;&gt;Question of the week: How do you tag a list of rows with custom periodic rules?&lt;/h2&gt;

&lt;p&gt;A StackOverflow user asked how to tag orders in a table that meet a certain 
criterion relying on periodicity. You could certainly craft some complicated and
inefficient SQL queries to address this. With &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MATCH_RECOGNIZE&lt;/code&gt;, however, you can now solve it directly
while taking advantage of the efficient matching capabilities that Martin and
Kasia have added.&lt;/p&gt;

&lt;p&gt;Here is an example orders table, represented as CSV:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Create_time, Order_id, person_id, variable_a
&apos;2021-06-01&apos;, 1234, 2232, 1
&apos;2021-06-02&apos;, 1235, 2232, 0.6
&apos;2021-06-03&apos;, 1236, 2232, 0.33
&apos;2021-06-04&apos;, 1237, 2232, 0.7
&apos;2021-06-05&apos;, 1238, 2232, 0.6
&apos;2021-06-06&apos;, 1239, 2232, 0.4
&apos;2021-06-07&apos;, 1240, 2232, 0.8
&apos;2021-06-08&apos;, 1241, 2232, 0.7
&apos;2021-06-09&apos;, 1242, 2232, 0.4
&apos;2021-06-10&apos;, 1243, 2232, 0.6
&apos;2021-06-11&apos;, 1244, 2232, 0.7
&apos;2021-06-12&apos;, 1245, 2232, 0.6
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The grace period logic produces the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;final_hit&lt;/code&gt; column according to these rules:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_hit&lt;/code&gt; column is true if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;variable_a&lt;/code&gt; is less than or equal to 0.5.&lt;/li&gt;
  &lt;li&gt;A hit starts a grace period covering the 4 orders that follow it. Any hit 
within a grace period is ignored and does not start a new grace period; a hit
outside any grace period is marked in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;final_hit&lt;/code&gt; column.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Based on these rules, the desired result for the example is:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Create_time, Order_id, person_id, variable_a, is_hit, final_hit
&apos;2021-06-01&apos;, 1234, 2232, 1, NULL, NULL
&apos;2021-06-02&apos;, 1235, 2232, 0.6, NULL, NULL
&apos;2021-06-03&apos;, 1236, 2232, 0.33, true, true
&apos;2021-06-04&apos;, 1237, 2232, 0.7, NULL, NULL
&apos;2021-06-05&apos;, 1238, 2232, 0.6, NULL, NULL
&apos;2021-06-06&apos;, 1239, 2232, 0.4, true, NULL
&apos;2021-06-07&apos;, 1240, 2232, 0.8, NULL, NULL
&apos;2021-06-08&apos;, 1241, 2232, 0.7, NULL, NULL
&apos;2021-06-09&apos;, 1242, 2232, 0.4, true, true
&apos;2021-06-10&apos;, 1243, 2232, 0.6, NULL, NULL
&apos;2021-06-11&apos;, 1244, 2232, 0.7, NULL, NULL
&apos;2021-06-12&apos;, 1245, 2232, 0.6, NULL, NULL
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
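To make the rules concrete, here is a minimal Python sketch (our own, not from the episode; the function name and structure are invented) that reproduces the `is_hit` and `final_hit` tagging for the `variable_a` values above, using `None` to mirror SQL `NULL`:

```python
# Hypothetical re-implementation of the grace-period tagging rules in plain
# Python. A row is a hit when variable_a <= 0.5; a final hit opens a grace
# period covering the next `grace` rows, and hits inside a grace period are
# ignored (their final_hit stays None, mirroring SQL NULL).

def tag_final_hits(values, grace=4):
    """Return (is_hit, final_hit) tags for an ordered list of variable_a values."""
    tags = []
    remaining = 0  # rows left in the current grace period
    for value in values:
        is_hit = True if value <= 0.5 else None
        final_hit = None
        if remaining > 0:
            remaining -= 1          # still inside a grace period: ignore hits
        elif is_hit:
            final_hit = True        # hit outside any grace period
            remaining = grace       # open a new grace period
        tags.append((is_hit, final_hit))
    return tags

# variable_a values from the example table, in Create_time order
values = [1, 0.6, 0.33, 0.7, 0.6, 0.4, 0.8, 0.7, 0.4, 0.6, 0.7, 0.6]
print(tag_final_hits(values))
```

Running this marks rows 3 and 9 as final hits and leaves the hit on row 6 untagged, because it falls in the grace period opened by row 3, matching the desired result above.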

&lt;p&gt;To accomplish this with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MATCH_RECOGNIZE&lt;/code&gt;, you can use the following statement, 
which produces the correct result:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;WITH data(Create_time, Order_id, person_id, variable_a) AS (
    VALUES
      (DATE &apos;2021-06-01&apos;, 1234, 2232, 1),
      (DATE &apos;2021-06-02&apos;, 1235, 2232, 0.6),
      (DATE &apos;2021-06-03&apos;, 1236, 2232, 0.33),
      (DATE &apos;2021-06-04&apos;, 1237, 2232, 0.7),
      (DATE &apos;2021-06-05&apos;, 1238, 2232, 0.6),
      (DATE &apos;2021-06-06&apos;, 1239, 2232, 0.4),
      (DATE &apos;2021-06-07&apos;, 1240, 2232, 0.8),
      (DATE &apos;2021-06-08&apos;, 1241, 2232, 0.7),
      (DATE &apos;2021-06-09&apos;, 1242, 2232, 0.4),
      (DATE &apos;2021-06-10&apos;, 1243, 2232, 0.6),
      (DATE &apos;2021-06-11&apos;, 1244, 2232, 0.7),
      (DATE &apos;2021-06-12&apos;, 1245, 2232, 0.6)
)
SELECT Create_time, Order_id, person_id, variable_a, if(variable_a &amp;lt;= 0.5, true, null) is_hit, final_hit
FROM data
   MATCH_RECOGNIZE (
     PARTITION BY person_id
     ORDER BY Create_time
     MEASURES if(classifier() = &apos;HIT&apos;, true, null) AS final_hit
     ALL ROWS PER MATCH WITH UNMATCHED ROWS
     AFTER MATCH SKIP PAST LAST ROW
     PATTERN (HIT G{,4})
     DEFINE /* G -- grace period */
            HIT AS HIT.variable_a &amp;lt;= 0.5
  )
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Check out &lt;a href=&quot;https://stackoverflow.com/questions/68095763&quot;&gt;Martin and Kasia’s full answer to this question&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests Kasia Findeisen, Software Engineer at Starburst (@kasiafi). Release 360</summary>

      
      
    </entry>
  
    <entry>
      <title>22: TrinkedIn: LinkedIn gets a Trino promotion</title>
      <link href="https://trino.io/episodes/22.html" rel="alternate" type="text/html" title="22: TrinkedIn: LinkedIn gets a Trino promotion" />
      <published>2021-07-22T00:00:00+00:00</published>
      <updated>2021-07-22T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/22</id>
      <content type="html" xml:base="https://trino.io/episodes/22.html">&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/22/cbb-linkedin.png&quot; /&gt;&lt;br /&gt;
Commander Bun Bun, landing the job!
&lt;/p&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Akshay Rai, Staff Software Engineer at &lt;a href=&quot;https://www.linkedin.com/&quot;&gt;LinkedIn&lt;/a&gt;
 (&lt;a href=&quot;https://www.linkedin.com/in/akshayrai09/&quot;&gt;@akshayrai09&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Jithesh Rajan, Staff Software Engineer at &lt;a href=&quot;https://www.linkedin.com/&quot;&gt;LinkedIn&lt;/a&gt;
 (&lt;a href=&quot;https://www.linkedin.com/in/jithesh-tr-a3185b20/&quot;&gt;@jithesh-tr-a3185b20&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Laura Chen, Staff Software Engineer at &lt;a href=&quot;https://www.linkedin.com/&quot;&gt;LinkedIn&lt;/a&gt;
 (&lt;a href=&quot;https://www.linkedin.com/in/laura-yu-chen-3a75413/&quot;&gt;@laura-yu-chen-3a75413&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Pratham Desai, Software Engineer at &lt;a href=&quot;https://www.linkedin.com/&quot;&gt;LinkedIn&lt;/a&gt;
 (&lt;a href=&quot;https://www.linkedin.com/in/pratham-desai/&quot;&gt;@pratham-desai&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Raju Nalli, Staff Site Reliability Engineer at &lt;a href=&quot;https://www.linkedin.com/&quot;&gt;LinkedIn&lt;/a&gt;
 (&lt;a href=&quot;https://www.linkedin.com/in/rajunalli/&quot;&gt;@rajunalli&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;upcoming-release-and-trino-summit&quot;&gt;Upcoming release and Trino Summit&lt;/h2&gt;

&lt;h3 id=&quot;sneak-peek-items-for-360&quot;&gt;Sneak peek items for 360&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Automatic cluster internal TLS&lt;/li&gt;
  &lt;li&gt;Views support in Iceberg connector&lt;/li&gt;
  &lt;li&gt;Documentation for materialized views SQL commands&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DELETE&lt;/code&gt; and batch insert support for various JDBC-based connectors&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;trino-summit-2021&quot;&gt;Trino Summit 2021&lt;/h3&gt;

&lt;p&gt;Get excited for this year’s &lt;a href=&quot;https://blog.starburst.io/announcing-trino-summit-2021&quot;&gt;Trino Summit&lt;/a&gt;
hosted by &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt;. 
&lt;a href=&quot;https://www.starburst.io/info/trino-summit-call-for-papers/&quot;&gt;Registration and call for papers&lt;/a&gt;
is now open!&lt;/p&gt;

&lt;h3 id=&quot;linkedin-is-hiring&quot;&gt;LinkedIn is hiring!&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/jobs/view/2402727250/?alternateChannel=search&amp;amp;refId=VRDXEQNgS2gxtpsJaHPXjQ%3D%3D&amp;amp;trackingId=0GzsJkrXWYt6qHWSUHTvCg%3D%3D&amp;amp;trk=d_flagship3_search_srp_jobs&quot;&gt;Software Engineer - Big Data Platform&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.linkedin.com/jobs/view/2291645936/?eBP=CwEAAAF6y0tYtsROpAG7XxMEhLVgpq2rSMwpNv28Q_j06PdFsD_s11eFyh-sIv2rxm_Y8zN-p755Gts-ElMlR6XvK2hOMp3JMnxFPzOnZvvZnv_-oHaBslitgtWzsmJy7_f7BKljmgAUtfinG9WCp1Bpi574HZEBJwAsjzKx-89NUdnIBj_SBIPHES_G2RNqoKp5eZ4c0k7YaVJSuZJTyi2K6KoKJ7njT65FEOWvmS9S80ysbINbXjX_WSz71RNAugEpqIgE9-gB1MhW8tQ9z72jQhbjXMqSuUaYS43zFaP8ImXhjTrhbopTxyxTIN9yst6tvlcPo_T5RNAaf_0e8x_km2SGdw&amp;amp;recommendedFlavor=IN_NETWORK&amp;amp;refId=VRDXEQNgS2gxtpsJaHPXjQ%3D%3D&amp;amp;trackingId=5Qo2D07i3Wl%2FVhGeAvLtew%3D%3D&amp;amp;trk=flagship3_search_srp_jobs&quot;&gt;Senior Software Engineer - Big Data Platform&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-week-trino-at-linkedin&quot;&gt;Concept of the week: Trino at LinkedIn&lt;/h2&gt;

&lt;p&gt;The LinkedIn team covers the concept of the week in &lt;a href=&quot;https://www.youtube.com/watch?v=vlc84xB-Hfs&amp;amp;t=955s&quot;&gt;this section&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-digging-into-join-queries&quot;&gt;PR of the week: Digging into join queries&lt;/h2&gt;

&lt;p&gt;Today our PR of the week is from the future 🔮! 
&lt;a href=&quot;https://github.com/jitheshtr/trino/issues/1&quot;&gt;LinkedIn is currently investigating the issue&lt;/a&gt;.
This gives us a chance to talk about the research aspects that go into a PR.&lt;/p&gt;

&lt;p&gt;Consider a view &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;V&lt;/code&gt; that performs a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UNION ALL&lt;/code&gt; over an old table &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O&lt;/code&gt; and a newly 
migrated table &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;N&lt;/code&gt;. For &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;datepartition&lt;/code&gt; values older than &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;D&lt;/code&gt; (say 2021-06-05), 
data is read from table &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;O&lt;/code&gt;, while for dates equal to or greater than &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;D&lt;/code&gt;,
data comes from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;N&lt;/code&gt;.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/22/view-old-new-tables.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;The query in question is:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT * FROM V
WHERE x IN (SELECT x2 FROM Z)
AND cast(substring(datepartition,1,10) as date) &amp;gt;= date(&apos;2021-06-08&apos;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here, table &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Z&lt;/code&gt; has stats available and contains only 17 rows, while the 
data from view &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;V&lt;/code&gt; (which comes entirely from underlying table &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;N&lt;/code&gt; for this query) 
has, say, billions of rows.&lt;/p&gt;

&lt;p&gt;This query used to take about 39 seconds to run before our upgrade 
(PrestoSQL 333). After the upgrade (Trino 352) the runtime increased to 
approximately 35 minutes.&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-week-how-can-i-query-the-hive-views-from-trino&quot;&gt;Question of the week: How can I query the Hive views from Trino?&lt;/h2&gt;

&lt;p&gt;We actually covered the answer in &lt;a href=&quot;/episodes/18.html&quot;&gt;episode 18&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You can use the &lt;a href=&quot;https://engineering.linkedin.com/blog/2020/coral&quot;&gt;Coral&lt;/a&gt; 
project, which allows translation between different SQL dialects. For example, 
it processes HiveQL statements and converts them to an internal representation using
&lt;a href=&quot;https://calcite.apache.org/&quot;&gt;Apache Calcite&lt;/a&gt;. It then converts the internal
representation to Trino SQL. See &lt;a href=&quot;/docs/current/connector/hive.html#hive-views&quot;&gt;the docs&lt;/a&gt;
for more details.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;100%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/18/coral.png&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;This diagram shows the creation of a Hive view, then shows the sequence of events 
when Trino reads that view.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/18/hive-view-sequence.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://engineering.linkedin.com/blog/2020/coral&quot;&gt;https://engineering.linkedin.com/blog/2020/coral&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://engineering.linkedin.com/blog/2021/from-daily-dashboards-to-enterprise-grade-data-pipelines&quot;&gt;https://engineering.linkedin.com/blog/2021/from-daily-dashboards-to-enterprise-grade-data-pipelines&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://engineering.linkedin.com/blog/2018/11/using-translatable-portable-UDFs&quot;&gt;https://engineering.linkedin.com/blog/2018/11/using-translatable-portable-UDFs&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://engineering.linkedin.com/blog/2021/fastingest-low-latency-gobblin&quot;&gt;https://engineering.linkedin.com/blog/2021/fastingest-low-latency-gobblin&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Commander Bun Bun, landing the job!</summary>

      
      
    </entry>
  
    <entry>
      <title>21: Trino + dbt = a match made in SQL heaven?</title>
      <link href="https://trino.io/episodes/21.html" rel="alternate" type="text/html" title="21: Trino + dbt = a match made in SQL heaven?" />
      <published>2021-07-08T00:00:00+00:00</published>
      <updated>2021-07-08T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/21</id>
      <content type="html" xml:base="https://trino.io/episodes/21.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Amy Chen, Partner Solutions Architect at &lt;a href=&quot;https://www.getdbt.com/&quot;&gt;dbt Labs (formerly Fishtown Analytics)&lt;/a&gt;
 (&lt;a href=&quot;https://www.linkedin.com/in/yuanamychen/&quot;&gt;@yuanamychen&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Victor Coustenoble, Solutions Architect at &lt;a href=&quot;https://www.starburst.io/&quot;&gt;Starburst&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/victorcouste&quot;&gt;@victorcouste&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;release-359&quot;&gt;Release 359&lt;/h2&gt;

&lt;p&gt;Martin:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Row pattern recognition for window functions&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SET TIME ZONE&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;timestamp(n)&lt;/code&gt; with precision higher than 3 in MySQL&lt;/li&gt;
  &lt;li&gt;ARM64-compatible docker image&lt;/li&gt;
  &lt;li&gt;Support for granting &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UPDATE&lt;/code&gt; privilege&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SET TIME ZONE&lt;/code&gt; is a feature from our guest Marius from last time!&lt;/li&gt;
  &lt;li&gt;The ARM64-compatible Docker image, together with the existing tar.gz and RPM archives, means Graviton and other ARM64 processors are now also an option for Kubernetes users. There are significant cost/performance benefits, so try it out&lt;/li&gt;
  &lt;li&gt;wow .. this time it took a whole month from 358 to 359&lt;/li&gt;
  &lt;li&gt;breaking change - need Java 11.0.11&lt;/li&gt;
  &lt;li&gt;more materialized view stuff, and I am working on docs!&lt;/li&gt;
  &lt;li&gt;Fix handling of multiple LDAP user bind patterns - for those of us in larger orgs..&lt;/li&gt;
  &lt;li&gt;network logging in CLI&lt;/li&gt;
  &lt;li&gt;rename &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;connector.name&lt;/code&gt; from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hive-hadoop2&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hive&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More info at &lt;a href=&quot;https://trino.io/docs/current/release/release-359.html&quot;&gt;https://trino.io/docs/current/release/release-359.html&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-week-can-dbt-connect-to-different-databases-in-the-same-project&quot;&gt;Question of the week: Can dbt connect to different databases in the same project?&lt;/h2&gt;

&lt;p&gt;This week we are going a little out of order from our usual sequence on this
show. The question really gets to the heart of the concept of the week. We’ll 
cover this first then jump into the concept.&lt;/p&gt;

&lt;p&gt;This question was asked on &lt;a href=&quot;https://stackoverflow.com/questions/63002171&quot;&gt;StackOverflow&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;It seems dbt only works for a single database. If my data is in a different 
database, will that still work? For example, if my datalake is using delta, 
but I want to run dbt using Redshift, would dbt still work for this case?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Our guest Victor replied:&lt;/p&gt;

&lt;p&gt;You can use Trino with dbt to connect to multiple databases in the same project.&lt;/p&gt;

&lt;p&gt;The GitHub example project &lt;a href=&quot;https://github.com/victorcouste/trino-dbt-demo&quot;&gt;https://github.com/victorcouste/trino-dbt-demo&lt;/a&gt; 
contains a fully working setup that you can replicate and adapt to your needs.&lt;/p&gt;
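As a concrete illustration, a dbt profile targeting Trino might look roughly like the following sketch. The exact field names depend on the dbt adapter version in use, and the project name, host, catalog, and schema values here are invented; treat this as an assumption-laden example rather than a verified configuration:

```yaml
# Hypothetical profiles.yml entry for a Trino target (values are made up).
my_trino_project:
  target: dev
  outputs:
    dev:
      type: trino
      method: none            # no authentication, e.g. a local test cluster
      user: admin
      host: trino.example.com
      port: 8080
      database: hive          # the Trino catalog dbt materializes models into
      schema: analytics
      threads: 4
```

Because Trino federates catalogs, models in the project can still read from other catalogs (for example a data lake or PostgreSQL catalog) while dbt writes results into the one named here, which is what makes the multi-database setup from the question possible.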

&lt;h2 id=&quot;concept-of-the-week&quot;&gt;Concept of the week:&lt;/h2&gt;

&lt;h3 id=&quot;what-is-dbt&quot;&gt;What is dbt?&lt;/h3&gt;

&lt;p&gt;dbt is a transformation workflow tool that lets teams quickly and collaboratively 
deploy analytics code, following software engineering best practices like 
modularity, CI/CD, testing, and documentation. It enables anyone who knows SQL 
to build production-grade data pipelines.&lt;/p&gt;

&lt;p&gt;When referring to dbt, it can mean two slightly different things. dbt Core is 
the open source framework that provides the SQL compiler and the tooling to manage
your SQL workflow. You interact with it via a command line interface. In 
addition, dbt Labs offers the fully managed SaaS product dbt Cloud. You can use 
it to handle all of your dbt projects from development to deployment in a single 
browser-based tool. It provides useful features like a full IDE to develop and 
test code, orchestration, logging, and alerting. At the moment, dbt Cloud is not
available for Trino users.&lt;/p&gt;

&lt;p&gt;The framework allows you to check the quality of results, document the lineage, 
manage changes and versions of the SQL scripts, and orchestrate the queries, much
like a CI/CD framework for your data. dbt is not an extract and load tool. The 
focus is on transforming what is already in your data warehouse or data lake.&lt;/p&gt;

&lt;p&gt;Check out these links to learn more:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.getdbt.com/&quot;&gt;https://www.getdbt.com/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.getdbt.com/docs/introduction&quot;&gt;https://docs.getdbt.com/docs/introduction&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;goals-of-dbt-and-how-that-differs-from-trino&quot;&gt;Goals of dbt and how that differs from Trino&lt;/h3&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/21/dbt-trino-architecture.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;Trino is the SQL execution engine and dbt is the framework to manage your SQL 
statements. dbt does not execute the SQL itself; rather, it pushes all of the 
compute down to the SQL engine. This SQL engine can be Trino, or an engine 
included in the data source, like the database itself. Using Trino as the SQL 
execution engine allows you to use the same SQL dialect for all connected data 
sources. This includes data sources that do not natively support SQL, like object
storage systems, Kafka, Elasticsearch, and many others.&lt;/p&gt;

&lt;h3 id=&quot;transformation-vs-ad-hoc-joins&quot;&gt;Transformation vs ad-hoc joins&lt;/h3&gt;

&lt;p&gt;Transformations done by dbt are in general used to clean and prepare data for 
analytics purposes. They often take you from raw data to ready-to-use 
data for reporting and analysis. dbt creates database objects like tables or 
views to be consumed by business users and analytics tools.&lt;/p&gt;

&lt;p&gt;On the other hand, even though Trino can also execute SQL to create tables and 
views, these SQL queries are not managed, just executed. Unlike dbt, Trino does
not provide a framework to version, audit, document, and orchestrate SQL 
scripts and their execution. Trino is more often used to execute SQL SELECT 
statements generated by users or BI tools to analyze data interactively.&lt;/p&gt;

&lt;h3 id=&quot;cases-for-why-you-need-both&quot;&gt;Cases for why you need both&lt;/h3&gt;

&lt;p&gt;Trino and dbt are complementary when you need to access different sources from
a single SQL query, or when you need to run SQL queries with good performance on
object storage systems like S3, GCS, ADLS, or HDFS.&lt;/p&gt;

&lt;p&gt;This is where Trino complements dbt: dbt can only access a single data 
warehouse connection in a SQL query, so in dbt alone there is no way to query 
multiple storage systems at the same time.&lt;/p&gt;
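&lt;p&gt;As an illustration of this federation, a single Trino query can join catalogs
backed by different systems. The catalog, schema, table, and column names below
are hypothetical:&lt;/p&gt;

```sql
-- Hypothetical federated query: joins a table in an object storage catalog
-- (hive) with a table in a relational catalog (postgresql) in one statement.
SELECT o.order_id, o.total_price, c.customer_name
FROM hive.sales.orders AS o
JOIN postgresql.public.customers AS c
  ON o.customer_id = c.customer_id
WHERE o.order_date >= DATE '2021-01-01';
```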

&lt;p&gt;Trino is recognized for great performance with object storage and data lake 
processing. Combined with dbt, it can transform and prepare data at scale. Trino 
also allows you to run dbt against a traditional, on-premises data warehouse, 
where normally dbt only runs on a modern cloud data warehouse like Snowflake, 
BigQuery, or Redshift.&lt;/p&gt;

&lt;h3 id=&quot;dbt-basics&quot;&gt;dbt basics&lt;/h3&gt;

&lt;p&gt;dbt Labs offers a &lt;a href=&quot;https://docs.getdbt.com/tutorial/setting-up&quot;&gt;good tutorial&lt;/a&gt;
which covers the fundamental concepts of dbt:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Project: A directory of SQL and YAML files, defined with a single project file.&lt;/li&gt;
  &lt;li&gt;Model: A single SQL file where you define your transformations to create a table or a view.&lt;/li&gt;
  &lt;li&gt;Profile: Defines the connections to your data sources.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then you have other resources like seeds, macros, tests, sources, snapshots.&lt;/p&gt;
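&lt;p&gt;As a small illustration, a model is just a SELECT statement in a .sql file
that dbt materializes as a table or view. The model and source names here are
hypothetical:&lt;/p&gt;

```sql
-- models/customer_orders.sql (hypothetical model and upstream model names)
-- dbt compiles the ref() call to the object built by the upstream model,
-- and materializes this SELECT according to the configuration.
{{ config(materialized='view') }}

select
    customer_id,
    count(*) as order_count,
    sum(total_price) as lifetime_value
from {{ ref('stg_orders') }}
group by customer_id
```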

&lt;h2 id=&quot;demo-querying-trino-from-a-dbt-project&quot;&gt;Demo: Querying Trino from a dbt project&lt;/h2&gt;

&lt;p&gt;Victor shows us a demo from 
&lt;a href=&quot;https://medium.com/geekculture/trino-dbt-a-match-in-sql-heaven-1df2a3d12b5e&quot;&gt;his blog post that inspired this episode&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you looked at the code, you may have noticed that it used an adapter 
called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dbt-presto&lt;/code&gt;. This adapter derives from the outdated Presto naming and is
still there for interaction with legacy Presto clusters. Although it can work,
it uses an outdated Python client to interact with Trino, and there is an open
&lt;a href=&quot;https://github.com/dbt-labs/dbt-presto/issues/39&quot;&gt;issue to create an official &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dbt-trino&lt;/code&gt; adapter&lt;/a&gt; 
that uses the updated &lt;a href=&quot;https://github.com/trinodb/trino-python-client&quot;&gt;trino-python-client&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you want to help with this, reach out on the issue itself and join the 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;#db-presto-trino&lt;/code&gt; channel on the dbt Slack. 
&lt;a href=&quot;https://community.getdbt.com/&quot;&gt;https://community.getdbt.com/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After the show, &lt;a href=&quot;https://twitter.com/findinpath&quot;&gt;Marius Grama&lt;/a&gt; started &lt;a href=&quot;https://github.com/findinpath/dbt-trino&quot;&gt;work on
dbt-trino in his own repository&lt;/a&gt;.
Thanks for the quick turnaround, Marius!&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-8283-externalised-destination-table-cache-expiry-duration-for-bigquery-connector&quot;&gt;PR of the week: PR 8283 Externalised destination table cache expiry duration for BigQuery Connector&lt;/h2&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/trinodb/trino/pull/8283&quot;&gt;PR of the week&lt;/a&gt; was committed 
by Ayush Bilala (&lt;a href=&quot;https://twitter.com/ayushbilala&quot;&gt;Twitter&lt;/a&gt;, &lt;a href=&quot;https://www.linkedin.com/in/ayush-bilala/&quot;&gt;LinkedIn&lt;/a&gt;), a Staff Software Engineer at
Walmart Global Tech.&lt;/p&gt;

&lt;p&gt;This fixes &lt;a href=&quot;https://github.com/trinodb/trino/issues/8236&quot;&gt;issue 8236&lt;/a&gt; by adding
a new configuration property for the BigQuery connector, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bigquery.views-cache-ttl&lt;/code&gt;, 
to allow configuring the cache expiration for BigQuery views.&lt;/p&gt;

&lt;p&gt;Thanks Ayush!&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;News&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;The “frog” book has been &lt;a href=&quot;https://item.jd.com/10028492426649.html&quot;&gt;translated to Chinese&lt;/a&gt;!
 Keep your eyes peeled for the rebrand into Trino for the translation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/15/training-advanced-sql.html&quot;&gt;Advanced SQL Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/30/training-query-tuning.html&quot;&gt;Query Tuning Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/13/training-security.html&quot;&gt;Security Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/27/training-performance.html&quot;&gt;Performance and Tuning Training&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests</summary>

      
      
    </entry>
  
    <entry>
      <title>20: Trino for the Trinewbie</title>
      <link href="https://trino.io/episodes/20.html" rel="alternate" type="text/html" title="20: Trino for the Trinewbie" />
      <published>2021-06-23T00:00:00+00:00</published>
      <updated>2021-06-23T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/20</id>
      <content type="html" xml:base="https://trino.io/episodes/20.html">&lt;script async=&quot;&quot; defer=&quot;&quot; src=&quot;https://buttons.github.io/buttons.js&quot;&gt;&lt;/script&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Marius Grama, Data Engineer at &lt;a href=&quot;https://www.willhaben.at/&quot;&gt;willhaben internet service GmbH &amp;amp; Co KG&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/findinpath&quot;&gt;@findinpath&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-week-trino-for-the-trinewbie&quot;&gt;Concept of the week: Trino for the Trinewbie&lt;/h2&gt;

&lt;p&gt;One of the best and easiest ways to get an understanding of Trino, and how to
use it, is the book Trino: The Definitive Guide. The next three sections have a few 
excerpts from the book, which does an incredible job of introducing the space 
Trino is in. If you would like to read the book in its entirety, Starburst 
offers &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the digital copy for free&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;the-problems-with-big-data&quot;&gt;The Problems with Big Data&lt;/h3&gt;

&lt;p&gt;Everybody is capturing more and more data from device metrics, user behavior
tracking, business transactions, location data, software and system testing 
procedures and workflows, and much more. The insights gained from understanding
that data and working with it can make or break the success of any initiative,
or even a company.&lt;/p&gt;

&lt;p&gt;At the same time, the diversity of storage mechanisms available for data has 
exploded: relational databases, NoSQL databases, document databases, key-value 
stores, object storage systems, and so on. Many of them are necessary in today’s
organizations, and it is no longer possible to use just one of them.&lt;/p&gt;

&lt;h3 id=&quot;what-is-trino&quot;&gt;What is Trino?&lt;/h3&gt;

&lt;p&gt;Trino is not a database with storage, rather, it simply queries data where it 
lives. When using Trino, storage and compute are decoupled and can be scaled 
independently. Trino represents the compute layer, whereas the underlying data 
sources represent the storage layer.&lt;/p&gt;

&lt;p&gt;This allows Trino to scale up and down its compute resources for query 
processing, based on analytics demand to access this data. There is no need to 
move your data, and provision compute and storage to the exact needs of the 
current queries, or change that regularly, based on your changing query needs.&lt;/p&gt;

&lt;p&gt;Trino can scale the query power by scaling the compute cluster dynamically, and 
the data can be queried right where it lives in the data source. This 
characteristic allows you to greatly optimize your hardware resource needs and 
therefore reduce cost.&lt;/p&gt;

&lt;h3 id=&quot;sql-on-anything&quot;&gt;SQL-on-Anything&lt;/h3&gt;

&lt;p&gt;Trino was initially designed to query data from HDFS. And it can do that very 
efficiently, as you learn later. But that is not where it ends. On the contrary,
Trino is a query engine that can query data from object storage, relational
database management systems (RDBMSs), NoSQL databases, and other systems.&lt;/p&gt;

&lt;p&gt;Trino queries data where it lives and does not require a migration of data to a 
single location. So Trino allows you to query data in HDFS and other distributed
object storage systems. It allows you to query RDBMSs and other data sources. As
such, it can really query data wherever it lives and therefore be a replacement
to the traditional, expensive, and heavy extract, transform, and load (ETL) 
processes. Or at a minimum, it can help you with them and lighten the load. So 
Trino is clearly not just another SQL-on-Hadoop solution.&lt;/p&gt;

&lt;p&gt;Object storage systems include Amazon Web Services (AWS) Simple Storage Service
(S3), Microsoft Azure Blob Storage, Google Cloud Storage, and S3-compatible 
storage such as MinIO and Ceph. Trino can query traditional RDBMSs such as 
Microsoft SQL Server, PostgreSQL, MySQL, Oracle, Teradata, and Amazon Redshift. 
Trino can also query NoSQL systems such as Apache Cassandra, Apache Kafka, 
MongoDB, or Elasticsearch. Trino can query virtually anything and is truly a 
SQL-on-Anything system.&lt;/p&gt;

&lt;p&gt;For users, this means that suddenly they no longer have to rely on specific 
query languages or tools to interact with the data in those specific systems.
They can simply leverage Trino and their existing SQL skills and their 
well-understood analytics, dashboarding, and reporting tools. These tools, 
built on top of using SQL, allow analysis of those additional data sets, which 
are otherwise locked in separate systems. Users can even use Trino to query 
across different systems with the SQL they know.&lt;/p&gt;

&lt;h3 id=&quot;contributing-to-trino&quot;&gt;Contributing to Trino&lt;/h3&gt;

&lt;p&gt;In this episode, Marius Grama discusses his journey with Trino: joining the
community, his first impressions and experiences, and what led him to make 
sixteen commits over the last three months. We also ask him where he thinks we 
could improve to make the onboarding experience better.&lt;/p&gt;

&lt;p&gt;In the Trino project there are four &lt;a href=&quot;/development/roles.html&quot;&gt;roles&lt;/a&gt;.
You can immediately become a participant or reviewer. To be a contributor, you
need to follow some steps that are covered later in the episode. Likewise, for
maintainers, there is a path to becoming a maintainer that is discussed in 
detail on the roles page.&lt;/p&gt;

&lt;h4 id=&quot;participants&quot;&gt;Participants&lt;/h4&gt;

&lt;blockquote&gt;
  &lt;p&gt;Participants are those who show up and join in discussions about the project. 
Users, developers, and administrators can all be participants, as can 
literally anyone who has the time, energy, and passion to become involved. 
Participants suggest improvements and new features. They report bugs, 
regressions, performance issues, and so on. They work to make Trino better for
everyone.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4 id=&quot;contributors&quot;&gt;Contributors&lt;/h4&gt;

&lt;p&gt;Today’s episode covers the process that a contributor goes through to make a
code change, but simply put:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;A contributor submits code changes to Trino.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4 id=&quot;reviewers&quot;&gt;Reviewers&lt;/h4&gt;

&lt;blockquote&gt;
  &lt;p&gt;A reviewer reads a proposed change to Trino, and assesses how well the change 
aligns with the Trino vision and guidelines. This includes everything from 
high level project vision to low level code style. Everyone is invited and 
encouraged to review others’ contributions – you don’t need to be a maintainer
for that.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4 id=&quot;maintainers&quot;&gt;Maintainers&lt;/h4&gt;

&lt;blockquote&gt;
  &lt;p&gt;A maintainer is responsible for checking in code only after ensuring it has 
been reviewed thoroughly and aligns with the Trino vision and guidelines. In 
addition to merging code, a maintainer actively participates in discussions 
and reviews. Being a maintainer does not grant additional rights in the 
project to make changes, set direction, or anything else that does not align 
with the direction of the project. Instead, a maintainer is expected to bring
these to the project participants as needed to gain consensus. The maintainer
role is for an individual, so if a maintainer changes employers, the role is 
retained. However, if a maintainer is no longer actively involved in the 
project, their maintainer status will be reviewed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There is &lt;a href=&quot;https://cwiki.apache.org/confluence/display/Hive/BecomingACommitter&quot;&gt;a writeup on the Apache Hive process to become a committer.&lt;/a&gt;
For context, a committer is equivalent to a maintainer in Trino. This writeup
aligns precisely with the Trino philosophy. Here are a few good quotes from that
article:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Contributors often ask Hive PMC members the question, “What do I need to do in
order to become a committer?” The simple (though frustrating) answer to this 
question is, “If you want to become a committer, behave like a committer.” If 
you follow this advice, then rest assured that the PMC will notice, and 
committership will seek you out rather than the other way around.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;It should go without saying, but here it is anyway: your participation in the 
project should be a natural part of your work with Hive; if you find yourself 
undertaking tasks “so that you can become a committer”, then you’re doing it 
wrong, young padawan. This is particularly true if your motivations for 
wanting to become a committer are primarily negative or self-centered&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-8135-set-default-time-zone-for-the-current-session&quot;&gt;PR of the week: PR 8135 Set default time zone for the current session&lt;/h2&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/trinodb/trino/pull/8135&quot;&gt;PR of the week&lt;/a&gt; was committed 
by today’s guest, &lt;a href=&quot;https://twitter.com/findinpath&quot;&gt;Marius Grama&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This fixes &lt;a href=&quot;https://github.com/trinodb/trino/issues/8112&quot;&gt;issue 8112&lt;/a&gt; by adding
support for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SET TIME ZONE&lt;/code&gt; statement. The specified time zone is 
stored as a session property and has lower precedence than the 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sql.forced-session-time-zone&lt;/code&gt; setting.&lt;/p&gt;
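&lt;p&gt;Based on the statement added in this PR, usage looks roughly like the
following; the zone name is just an example:&lt;/p&gt;

```sql
-- Set the time zone for the current session
SET TIME ZONE 'America/Los_Angeles';

-- Reset back to the default time zone
SET TIME ZONE LOCAL;
```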

&lt;p&gt;Thanks Marius!&lt;/p&gt;

&lt;h2 id=&quot;demo-contributing-to-trino&quot;&gt;Demo: Contributing to Trino&lt;/h2&gt;

&lt;p&gt;Here is the video that goes into detail on the steps below on how to contribute
code to Trino!&lt;/p&gt;

&lt;div class=&quot;youtube-video-container&quot;&gt;
  &lt;iframe width=&quot;702&quot; height=&quot;405&quot; src=&quot;https://www.youtube.com/embed/gAqYkR2oGgM&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Download an IDE.&lt;/p&gt;

    &lt;p&gt;First, you need to have an integrated development environment (IDE) to run 
 the code. We recommend &lt;a href=&quot;https://www.jetbrains.com/idea/download/&quot;&gt;Intellij Community Edition&lt;/a&gt;
 as it is the standard that is used by developers across the project. Of 
 course, you may use any IDE you like, but there may be issues that others 
 may not be able to help with as readily.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Install Git.&lt;/p&gt;

    &lt;p&gt;&lt;a href=&quot;https://git-scm.com/&quot;&gt;Git&lt;/a&gt; is distributed version control software 
 used to collaborate on code with other users. You must 
 &lt;a href=&quot;https://git-scm.com/book/en/v2/Getting-Started-Installing-Git&quot;&gt;install git&lt;/a&gt;
 in order to contribute to the project.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Install Docker.&lt;/p&gt;

    &lt;p&gt;The Trino testing framework runs Trino and other databases it connects to on
 Docker, a tool that runs different services in isolation using containers.&lt;br /&gt;
 Go ahead and &lt;a href=&quot;https://docs.docker.com/engine/install/&quot;&gt;install Docker&lt;/a&gt; on 
 your system.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Create and configure your GitHub account.&lt;/p&gt;

    &lt;p&gt;GitHub offers free Git repository hosting, and is a central point of collaboration
 for the Trino project. If you haven’t done so, please 
 &lt;a href=&quot;https://git-scm.com/book/en/v2/GitHub-Account-Setup-and-Configuration&quot;&gt;create and configure your GitHub account&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Make a fork of the Trino repository on GitHub&lt;/p&gt;

    &lt;p&gt;Navigate to &lt;a href=&quot;https://github.com/trinodb/trino&quot;&gt;the Trino repository&lt;/a&gt; and 
 click the “fork” button. Or you can just click it here: &lt;a class=&quot;github-button&quot; href=&quot;https://github.com/trinodb/trino/fork&quot; data-icon=&quot;octicon-repo-forked&quot; data-size=&quot;large&quot;&gt;Fork&lt;/a&gt;.&lt;/p&gt;

    &lt;p&gt;You want to create a fork so that you can save your work without needing the
 special privileges it takes to commit code back to the Trino repository. 
 This way, you can upload (also called a “push” in Git) your code to your 
 fork and later open a pull request into the main Trino repository.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Clone your fork of the Trino repository to your computer and import into Intellij.&lt;/p&gt;

    &lt;p&gt;Execute the following clone command in your terminal:&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; git clone git@github.com:&amp;lt;your_username&amp;gt;/trino.git
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;

    &lt;p&gt;Open the &lt;a href=&quot;https://www.jetbrains.com/help/idea/maven-support.html#maven_import_project_start&quot;&gt;Trino project in Intellij&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Add the Airlift code style checks to Intellij.&lt;/p&gt;

    &lt;p&gt;There are many unspoken rules to code style and formatting in any project. 
Trino is no exception. To make life simpler for contributors and reviewers, 
there is a &lt;a href=&quot;https://raw.githubusercontent.com/airlift/codestyle/master/IntelliJIdea2019/Airlift.xml&quot;&gt;Trino code style definition&lt;/a&gt; 
that &lt;a href=&quot;https://www.jetbrains.com/help/idea/copying-code-style-settings.html&quot;&gt;you can import into Intellij&lt;/a&gt; 
so that the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Reformat Code&lt;/code&gt; action formats code in the desired style of the project.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Build the project.&lt;/p&gt;

    &lt;p&gt;One of the greatest resources in Trino history is &lt;a href=&quot;https://gist.github.com/findepi/04c96f0f60dcc95329f569bb0c44a0cd&quot;&gt;this cheat sheet&lt;/a&gt;
created by &lt;a href=&quot;https://twitter.com/findepi&quot;&gt;Piotr Findeisen&lt;/a&gt;. I use it for some
of the commands, but the most important use is the “fast” build command he
adds at the top. In your terminal, make sure you are located in the root 
directory of the Trino project, and run the following command.&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./mvnw -pl &apos;!:trino-server-rpm,!:trino-docs,!:trino-proxy,!:trino-verifier,!:trino-benchto-benchmarks&apos; clean install \
-T 2C -nsu \
-DskipTests \
-Dmaven.javadoc.skip=true \
-Dmaven.source.skip=true \
-Dair.check.skip-all=true
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;

    &lt;p&gt;This builds all necessary modules of the project to run almost everything
in Trino. The build excludes some modules, runs the compiler on multiple 
threads, and skips the tests, javadocs, and the Airlift code style checks. If you
would like to run the code style checks on a specific module (e.g. 
trino-elasticsearch), you can run the following command.&lt;/p&gt;

    &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./mvnw -pl &apos;:trino-elasticsearch&apos; clean install \
-T 2C -nsu \
-DskipTests \
-Dmaven.javadoc.skip=true \
-Dmaven.source.skip=true 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Sign the CLA.&lt;/p&gt;

    &lt;p&gt;Sign the &lt;a href=&quot;https://github.com/trinodb/cla/blob/master/Trino%20Foundation%20Individual%20CLA.pdf&quot;&gt;contributor license agreement (CLA)&lt;/a&gt; 
 to agree that all of your code you commit to the project is subject to the 
 Apache License 2.0. Once you sign the agreement, scan and submit the form to
 &lt;a href=&quot;mailto:cla@trino.io&quot;&gt;cla@trino.io&lt;/a&gt;. This email gets checked every few days,
 and you can check if your name has been added to the &lt;a href=&quot;https://github.com/trinodb/cla/blob/master/contributors&quot;&gt;contributors&lt;/a&gt;
 list.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;At this point you can look for an &lt;a href=&quot;https://github.com/trinodb/trino/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22&quot;&gt;issue labeled “good first issue”&lt;/a&gt;.
This label identifies issues that we think are more approachable for developers who 
aren’t as familiar with the Trino repository yet.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;One final thing before you move on to the contribution process. Before you
start jumping in and changing the code, you’ll also want to create a special
branch for your changes. A branch in Git keeps all the changes you make 
isolated in a separate line of work. If something goes wrong, or you need to 
compare with an older branch, you can do so. The default branch may either be
named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;master&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt;. See &lt;a href=&quot;https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging&quot;&gt;more on branching in git&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To make a branch for your feature, you can run the following command:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git checkout -b my-feature-branch
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;ol start=&quot;12&quot;&gt;
  &lt;li&gt;Follow the remaining steps in the &lt;a href=&quot;https://trino.io/development/process.html&quot;&gt;contribution process page&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;question-of-the-week-how-do-i-remove-nulls-from-an-array-in-trino&quot;&gt;Question of the week: How do I remove nulls from an array in Trino?&lt;/h2&gt;

&lt;p&gt;A &lt;a href=&quot;https://stackoverflow.com/questions/66162776&quot;&gt;question posted to StackOverflow&lt;/a&gt; 
asked the following:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I’m extracting data from a json column in Trino and getting the output in an 
array like this &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[&apos;AL&apos;, NULL, &apos;NEW&apos;]&lt;/code&gt;. The problem is I need to remove the null since
the array has to be mapped to another array. I tried several options but no luck.
How can I remove the null and get only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[&apos;AL&apos;, &apos;NEW&apos;]&lt;/code&gt; without unnesting?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href=&quot;https://twitter.com/findepi&quot;&gt;Piotr Findeisen&lt;/a&gt; replied:&lt;/p&gt;

&lt;p&gt;You can use &lt;a href=&quot;https://trino.io/docs/current/functions/array.html#filter&quot;&gt;filter()&lt;/a&gt;
for this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;trino&amp;gt; SELECT filter(ARRAY[&apos;AL&apos;, NULL,&apos;NEW&apos;], e -&amp;gt; e IS NOT NULL);
   _col0
-----------
 [AL, NEW]
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
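&lt;p&gt;Since the asker mentions mapping the cleaned array into another array, note
that filter() composes with transform(); this follow-up example is ours, not
part of the original answer:&lt;/p&gt;

```sql
-- Remove nulls first, then map each remaining element (lowercasing here)
SELECT transform(
  filter(ARRAY['AL', NULL, 'NEW'], e -> e IS NOT NULL),
  x -> lower(x));
-- result: [al, new]
```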

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;News&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;The “frog” book has been &lt;a href=&quot;https://item.jd.com/10028492426649.html&quot;&gt;translated to Chinese&lt;/a&gt;!
 Keep your eyes peeled for the rebrand into Trino for the translation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/15/training-advanced-sql.html&quot;&gt;Advanced SQL Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/30/training-query-tuning.html&quot;&gt;Query Tuning Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/13/training-security.html&quot;&gt;Security Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/27/training-performance.html&quot;&gt;Performance and Tuning Training&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary></summary>

      
      
    </entry>
  
    <entry>
      <title>19: Data Ingestion to Iceberg and Trino</title>
      <link href="https://trino.io/episodes/19.html" rel="alternate" type="text/html" title="19: Data Ingestion to Iceberg and Trino" />
      <published>2021-06-10T00:00:00+00:00</published>
      <updated>2021-06-10T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/19</id>
      <content type="html" xml:base="https://trino.io/episodes/19.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;Cory Darby, Principal Software Developer at &lt;a href=&quot;https://bluecatnetworks.com/&quot;&gt;BlueCat&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/ckdarby&quot;&gt;@ckdarby&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;release-358&quot;&gt;Release 358&lt;/h2&gt;

&lt;p&gt;Martin:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SHOW STATS&lt;/code&gt; support for arbitrary queries.&lt;/li&gt;
  &lt;li&gt;Performance improvements for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ORDER BY ... LIMIT&lt;/code&gt; queries on sorted data.&lt;/li&gt;
  &lt;li&gt;Support for Hive views containing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LATERAL VIEW&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Reduced graceful shutdown time&lt;/li&gt;
  &lt;li&gt;A bunch of performance and correctness fixes&lt;/li&gt;
  &lt;li&gt;Removed support for legacy JDBC string in driver &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jdbc:presto:&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More info at &lt;a href=&quot;https://trino.io/docs/current/release/release-358.html&quot;&gt;https://trino.io/docs/current/release/release-358.html&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;release-357&quot;&gt;Release 357&lt;/h2&gt;

&lt;p&gt;Martin:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Support for subquery expressions that produce multiple columns.&lt;/li&gt;
  &lt;li&gt;Support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CURRENT_CATALOG&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CURRENT_SCHEMA&lt;/code&gt;.&lt;/li&gt;
  &lt;li&gt;Aggregation pushdown for ClickHouse connector.&lt;/li&gt;
  &lt;li&gt;Rule support for identifier mapping in various connectors.&lt;/li&gt;
  &lt;li&gt;New &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;format_number&lt;/code&gt; function.&lt;/li&gt;
  &lt;li&gt;Cast row types as JSON objects.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Print dynamic filters summary in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXPLAIN ANALYZE&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Fix trusted cert usage for OAuth&lt;/li&gt;
  &lt;li&gt;clear command in CLI&lt;/li&gt;
  &lt;li&gt;Numerous smaller connector changes - check your favourite connector&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More at &lt;a href=&quot;https://trino.io/docs/current/release/release-357.html&quot;&gt;https://trino.io/docs/current/release/release-357.html&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-week-ingesting-into-iceberg-with-pulsar-and-flink-at-bluecat&quot;&gt;Concept of the week: Ingesting into Iceberg with Pulsar and Flink at BlueCat&lt;/h2&gt;

&lt;p&gt;Here are Cory’s slides that you can use to follow along while listening to the 
podcast.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;iframe src=&quot;//www.slideshare.net/slideshow/embed_code/key/5KsmZMJtSOoxFx&quot; width=&quot;800&quot; height=&quot;650&quot; frameborder=&quot;0&quot; marginwidth=&quot;0&quot; marginheight=&quot;0&quot; scrolling=&quot;no&quot; style=&quot;border:1px solid #CCC; border-width:1px; 
margin-bottom:5px; max-width: 100%;&quot; allowfullscreen=&quot;&quot;&gt; 
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-1905-add-format_number-function&quot;&gt;PR of the week: PR 1905 Add format_number function&lt;/h2&gt;

&lt;p&gt;The
&lt;a href=&quot;https://github.com/trinodb/trino/pull/1905&quot;&gt;PR of the week&lt;/a&gt; is a simple but
very useful PR by maintainer &lt;a href=&quot;https://twitter.com/ebyhr&quot;&gt;Yuya Ebihara&lt;/a&gt;.
It fixes &lt;a href=&quot;https://github.com/trinodb/trino/issues/1878&quot;&gt;issue 1878&lt;/a&gt; by
formatting very large numbers returned from a query as truncated values with a
unit suffix (B for billion, M for million, K for thousand, and so on). Rather 
than reuse the CLI’s 
&lt;a href=&quot;https://github.com/trinodb/trino/blob/master/client/trino-cli/src/main/java/io/trino/cli/FormatUtils.java&quot;&gt;FormatUtils&lt;/a&gt;
class, which missed various cases, he created 
&lt;a href=&quot;https://github.com/trinodb/trino/blob/master/core/trino-main/src/main/java/io/trino/operator/scalar/FormatNumberFunction.java&quot;&gt;his own implementation&lt;/a&gt; 
that handles those cases. Thanks Yuya!&lt;/p&gt;

&lt;h2 id=&quot;demo-showing-the-format_number-functionality&quot;&gt;Demo: Showing the format_number functionality&lt;/h2&gt;

&lt;p&gt;Here are the examples we ran in the show.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT format_number(DOUBLE &apos;1234.5&apos;);

SELECT format_number(DOUBLE &apos;-9223372036854775808&apos;);

SELECT format_number(DOUBLE &apos;9223372036854775807&apos;);

SELECT format_number(REAL &apos;-999&apos;);

SELECT format_number(REAL &apos;999&apos;);

SELECT format_number(DECIMAL &apos;-1000&apos;);

SELECT format_number(DECIMAL &apos;1000&apos;);

SELECT format_number(999999999);

SELECT format_number(1000000000);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;question-of-the-week-how-do-i-search-nested-objects-in-elasticsearch-from-trino&quot;&gt;Question of the week: How do I search nested objects in Elasticsearch from Trino?&lt;/h2&gt;

&lt;p&gt;A &lt;a href=&quot;https://stackoverflow.com/questions/67667313&quot;&gt;question posted to StackOverflow&lt;/a&gt; 
asked how to search nested objects using the Elasticsearch connector.&lt;/p&gt;

&lt;p&gt;Trino maps a &lt;a href=&quot;https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nested&lt;/code&gt;&lt;/a&gt; 
object type to a &lt;a href=&quot;https://trino.io/docs/current/language/types.html#row&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ROW&lt;/code&gt;&lt;/a&gt;
the same way that it maps a standard 
&lt;a href=&quot;https://www.elastic.co/guide/en/elasticsearch/reference/current/object.html&quot;&gt;object&lt;/a&gt; 
type during a read. The nested designation itself serves no purpose to Trino 
since it only determines how the object is stored in Elasticsearch.&lt;/p&gt;
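
&lt;p&gt;As a sketch, assuming a hypothetical index named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;user&lt;/code&gt; with a 
nested &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;address&lt;/code&gt; object, the fields of the resulting 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ROW&lt;/code&gt; can then be accessed with dot notation:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-- index, catalog, and field names here are hypothetical
SELECT name, address.city
FROM elasticsearch.default.&quot;user&quot;
WHERE address.zip = &apos;80301&apos;;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;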

&lt;p&gt;Check out &lt;a href=&quot;https://stackoverflow.com/a/67843697/2023810&quot;&gt;Brian’s full answer to this question&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;News&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;The “frog” book has been &lt;a href=&quot;https://item.jd.com/10028492426649.html&quot;&gt;translated to Chinese&lt;/a&gt;!
 Keep your eyes peeled for the translation to be rebranded to Trino.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Blogs&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/adobetech/iceberg-at-adobe-88cf1950e866&quot;&gt;Iceberg at Adobe&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/05/03/a-gentle-introduction-to-iceberg.html&quot;&gt;Trino on ice I: A gentle introduction to Iceberg&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/07/12/in-place-table-evolution-and-cloud-compatibility-with-iceberg.html&quot;&gt;Trino on ice II: In-place table evolution and cloud compatibility with Iceberg&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/07/30/iceberg-concurrency-snapshots-spec.html&quot;&gt;Trino on ice III: Iceberg concurrency model, snapshots, and the Iceberg spec&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/08/12/deep-dive-into-iceberg-internals.html&quot;&gt;Trino on ice IV: Deep dive into Iceberg internals&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Videos&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Trino Meetup: &lt;a href=&quot;https://www.youtube.com/watch?v=ifXpOn0NJWk&quot;&gt;Apache Iceberg: A table format for data lakes with unforeseen use cases&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/15/training-advanced-sql.html&quot;&gt;Advanced SQL Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/30/training-query-tuning.html&quot;&gt;Query Tuning Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/13/training-security.html&quot;&gt;Security Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/27/training-performance.html&quot;&gt;Performance and Tuning Training&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests Cory Darby, Principal Software Developer at BlueCat (@ckdarby) Release 358</summary>

      
      
    </entry>
  
    <entry>
      <title>18: Trino enjoying the view</title>
      <link href="https://trino.io/episodes/18.html" rel="alternate" type="text/html" title="18: Trino enjoying the view" />
      <published>2021-05-20T00:00:00+00:00</published>
      <updated>2021-05-20T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/18</id>
      <content type="html" xml:base="https://trino.io/episodes/18.html">&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/18/trino-view.png&quot; /&gt;&lt;br /&gt;
Commander Bun Bun enjoying the views...
&lt;/p&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;Anjali Norwood, Senior Open Source Software Engineer at Netflix 
 (&lt;a href=&quot;https://www.linkedin.com/in/anjali-norwood-9521a16/&quot;&gt;@AnjaliNorwood&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-week-trino-views-hive-views-and-materialized-views&quot;&gt;Concept of the week: Trino Views, Hive Views and Materialized Views&lt;/h2&gt;

&lt;p&gt;Before diving into views, it can be helpful to take a step back and consider a 
well-understood abstraction, the table, to understand the purpose of a view.
Tables contain data in a vertical orientation, referred to as columns, and
represent instances of the data in a horizontal orientation, referred to as rows.
See the following &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;customer&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;orders&lt;/code&gt; tables from the TPC-H dataset.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;customer table&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;custkey&lt;/th&gt;
      &lt;th&gt;name&lt;/th&gt;
      &lt;th&gt;nationkey&lt;/th&gt;
      &lt;th&gt;acctbal&lt;/th&gt;
      &lt;th&gt;mktsegment&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;376&lt;/td&gt;
      &lt;td&gt;Customer#000000376&lt;/td&gt;
      &lt;td&gt;16&lt;/td&gt;
      &lt;td&gt;4231.45&lt;/td&gt;
      &lt;td&gt;AUTOMOBILE&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;377&lt;/td&gt;
      &lt;td&gt;Customer#000000377&lt;/td&gt;
      &lt;td&gt;23&lt;/td&gt;
      &lt;td&gt;1043.72&lt;/td&gt;
      &lt;td&gt;MACHINERY&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;378&lt;/td&gt;
      &lt;td&gt;Customer#000000378&lt;/td&gt;
      &lt;td&gt;22&lt;/td&gt;
      &lt;td&gt;5718.05&lt;/td&gt;
      &lt;td&gt;BUILDING&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;orders table&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;orderkey&lt;/th&gt;
      &lt;th&gt;custkey&lt;/th&gt;
      &lt;th&gt;orderstatus&lt;/th&gt;
      &lt;th&gt;totalprice&lt;/th&gt;
      &lt;th&gt;orderdate&lt;/th&gt;
      &lt;th&gt;orderpriority&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;376&lt;/td&gt;
      &lt;td&gt;O&lt;/td&gt;
      &lt;td&gt;172799.49&lt;/td&gt;
      &lt;td&gt;1996-01-02&lt;/td&gt;
      &lt;td&gt;5-LOW&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;376&lt;/td&gt;
      &lt;td&gt;O&lt;/td&gt;
      &lt;td&gt;38426.09&lt;/td&gt;
      &lt;td&gt;1996-12-01&lt;/td&gt;
      &lt;td&gt;1-URGENT&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;377&lt;/td&gt;
      &lt;td&gt;F&lt;/td&gt;
      &lt;td&gt;205654.3&lt;/td&gt;
      &lt;td&gt;1993-10-14&lt;/td&gt;
      &lt;td&gt;5-LOW&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The columns have a schema that enforces particular data types in particular 
columns and prevents insertion of invalid data into the table by throwing
an exception. This becomes extremely useful when reading and processing the data,
as there is a clear set of operations that can run on certain columns based on 
their type. This information is also useful when deserializing result sets into
various in-memory abstractions. Here is an example of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;customer&lt;/code&gt; table 
schema:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;customer table schema&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE TABLE customer (
   custkey bigint,
   name varchar(25),
   address varchar(40),
   nationkey bigint,
   phone varchar(15),
   acctbal double,
   mktsegment varchar(10),
   comment varchar(117)
)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;views-and-materialized-views&quot;&gt;Views and materialized views:&lt;/h3&gt;

&lt;p&gt;A view is structured like a regular database table in that it has columns,
rows, and a schema. What then do views offer over tables? Views offer a way to
encapsulate complex SQL statements. For example, take this SQL query that runs
over the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;customer&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;orders&lt;/code&gt; tables
defined before.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT 
 c.custkey, 
 name, 
 nationkey, 
 mktsegment, 
 sumtotalprice, 
 openstatuscount, 
 failedstatuscount, 
 partialstatuscount
FROM 
 customer c 
 JOIN (
  SELECT 
   custkey, 
   SUM(totalprice) AS sumtotalprice, 
   COUNT_IF(orderstatus = &apos;O&apos;) AS openstatuscount,
   COUNT_IF(orderstatus = &apos;F&apos;) AS failedstatuscount, 
   COUNT_IF(orderstatus = &apos;P&apos;) AS partialstatuscount
  FROM orders
  GROUP BY custkey
 ) o
 ON c.custkey = o.custkey;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This query performs some aggregations on the orders table grouped by customer.
Then there is a join performed on the aggregated orders table and customer table
by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;custkey&lt;/code&gt;.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;custkey&lt;/th&gt;
      &lt;th&gt;name&lt;/th&gt;
      &lt;th&gt;nationkey&lt;/th&gt;
      &lt;th&gt;mktsegment&lt;/th&gt;
      &lt;th&gt;sumtotalprice&lt;/th&gt;
      &lt;th&gt;openstatuscount&lt;/th&gt;
      &lt;th&gt;failedstatuscount&lt;/th&gt;
      &lt;th&gt;partialstatuscount&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;376&lt;/td&gt;
      &lt;td&gt;Customer#000000376&lt;/td&gt;
      &lt;td&gt;16&lt;/td&gt;
      &lt;td&gt;AUTOMOBILE&lt;/td&gt;
      &lt;td&gt;1600696.4700000002&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
      &lt;td&gt;1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;377&lt;/td&gt;
      &lt;td&gt;Customer#000000377&lt;/td&gt;
      &lt;td&gt;23&lt;/td&gt;
      &lt;td&gt;MACHINERY&lt;/td&gt;
      &lt;td&gt;803271.9400000001&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;379&lt;/td&gt;
      &lt;td&gt;Customer#000000379&lt;/td&gt;
      &lt;td&gt;7&lt;/td&gt;
      &lt;td&gt;AUTOMOBILE&lt;/td&gt;
      &lt;td&gt;3155009.54&lt;/td&gt;
      &lt;td&gt;7&lt;/td&gt;
      &lt;td&gt;11&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;From here, there are many ways you could further evaluate the resulting data. 
You could filter and look at which market segment is spending the most on your
products. You could also look at where there are the most failed orders by the
nation column to evaluate where shipping lines may need to be improved. The 
table above, which results from the example query, is a good intermediate state 
of the data that can be reused for many future evaluations. Instead of defining
a new table, you can create a view on this data that encapsulates the complex
SQL that was used to calculate it. This is done using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CREATE VIEW&lt;/code&gt; 
statement.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE VIEW customer_orders_view AS 
&amp;lt;complex SQL query above&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, when you want to run any further analysis on this intermediate dataset, you
simply refer to the view instead of having to rewrite the statement above. As
mentioned, this view also has a schema and is treated much like a table when the
query engine does its planning. In this way it is also easier to map the data to
the application logic by enabling different shapes of the same data. Note that
these views are read-only and do not allow inserting, updating, or deleting
through the view.&lt;/p&gt;
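
&lt;p&gt;For example, a follow-up analysis can query the view like any table:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT mktsegment, SUM(sumtotalprice) AS segmenttotal
FROM customer_orders_view
GROUP BY mktsegment
ORDER BY segmenttotal DESC;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;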

&lt;p&gt;Another reason to create a view is to control read access to the data. You
get to choose which columns and rows are filtered out and which are returned
when users query the view. The authorization of a user is tied to the view and
its content, which can differ significantly from the complete data in the
underlying tables. For example, views can exclude sensitive data like social
security numbers, birth dates, credit card numbers, and many other facts.&lt;/p&gt;
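
&lt;p&gt;As a sketch, assuming a hypothetical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;employee&lt;/code&gt; table with sensitive 
columns, a view can expose only a safe subset:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-- table and column names here are hypothetical
CREATE VIEW employee_public AS
SELECT name, department, hire_date
FROM employee;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;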

&lt;p&gt;When creating a view, there are two security modes that determine which user 
runs the query defined in the view at query time. The view query can run as the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DEFINER&lt;/code&gt;, the user that created the view, or as the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;INVOKER&lt;/code&gt;, the
user that runs the outer query against the view. The default mode is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DEFINER&lt;/code&gt;. See more 
&lt;a href=&quot;https://trino.io/docs/current/sql/create-view.html#security&quot;&gt;in the security section of the create view documentation&lt;/a&gt;.&lt;/p&gt;
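
&lt;p&gt;The mode is set with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SECURITY&lt;/code&gt; clause of 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CREATE VIEW&lt;/code&gt;, for example:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-- the view name here is hypothetical
CREATE OR REPLACE VIEW customer_names SECURITY INVOKER AS
SELECT custkey, name
FROM customer;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;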

&lt;p&gt;There are two types of views: logical and materialized views. The view defined
above is a standard logical view that gets expanded into its definition at query
time. Logical views do not provide any performance benefit, since no data is 
stored; the underlying query runs each time the view is queried. Materialized 
views persist the view data upon view creation by storing the query results.&lt;/p&gt;

&lt;p&gt;Materialized views make the overall query much faster to run, as part of the
query has already been computed. One issue with materialized views is that the 
data may become outdated and out of sync with the underlying table data. To keep 
the materialized view in sync with its tables, you have to refresh the view by 
issuing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;REFRESH MATERIALIZED VIEW&lt;/code&gt; command 
periodically, or by scheduling it to run automatically.&lt;/p&gt;
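
&lt;p&gt;As a sketch, reusing table and column names from the earlier examples, a 
materialized view is created and refreshed like this (the view name is 
hypothetical):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE MATERIALIZED VIEW customer_order_totals AS
SELECT custkey, SUM(totalprice) AS sumtotalprice
FROM orders
GROUP BY custkey;

REFRESH MATERIALIZED VIEW customer_order_totals;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;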

&lt;h3 id=&quot;trino-views-so-many-views-so-little-time&quot;&gt;Trino views: So many views, so little time&lt;/h3&gt;

&lt;p&gt;View handling in Trino depends on the connector. In general, most connectors
expose views to Trino as if they are another set of tables available for Trino
to query. The main exceptions to this are the Hive and Iceberg connectors. The 
table below lists the currently possible Hive and Iceberg views.&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
  &lt;tr&gt;
    &lt;th colspan=&quot;2&quot;&gt;&lt;/th&gt;
    &lt;th&gt;Logical&lt;/th&gt;
    &lt;th&gt;Materialized&lt;/th&gt;
  &lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
  &lt;tr&gt;
    &lt;td rowspan=&quot;2&quot;&gt;Trino Created View&lt;/td&gt;
    &lt;td&gt;Hive Connector&lt;/td&gt;
    &lt;td&gt;✅&lt;/td&gt;
    &lt;td&gt;❌&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td&gt;Iceberg Connector&lt;/td&gt;
    &lt;td&gt;✅ (Edit: &lt;a href=&quot;https://github.com/trinodb/trino/pull/8540&quot;&gt;PR 8540&lt;/a&gt;)&lt;/td&gt;
    &lt;td&gt;✅&lt;/td&gt;
  &lt;/tr&gt;
  &lt;tr&gt;
    &lt;td colspan=&quot;2&quot;&gt;Hive Created View&lt;/td&gt;
    &lt;td&gt;✅ (read-only)&lt;/td&gt;
    &lt;td&gt;✅ (read-only)&lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;You’ll notice that the materialized views cannot be created through the Hive
connector in Trino. You will get the following exception:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Caused by: java.sql.SQLException: Query failed (#...): 
This connector does not support creating materialized views.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Also, you cannot create logical views in Iceberg and you will get the following
exception:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Caused by: java.sql.SQLException: Query failed (#...): 
This connector does not support creating views.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h4 id=&quot;trino-reads-hive-views&quot;&gt;Trino reads Hive views&lt;/h4&gt;

&lt;p&gt;Before Trino there was Hive. Trino is a replacement for the Hive runtime for 
many users, and it is very useful for these users to also be able to read data 
from Hive views in Trino. Trino always aims to be compatible with as many Hive abstractions
as possible to make migrating away from Hive to Trino as painless as possible. 
So Trino supports reading data from Hive Views, though it doesn’t support 
updates on these views. You have to update these views through Hive and ideally
you will gradually migrate these views to Trino over time. Trino also supports
reading Hive materialized views, though Trino reads these as just another Hive 
table, since they are stored similarly to standard Hive tables. Since
Hive views are defined in HiveQL, the view definitions need to be translated to
Trino SQL syntax. This is done using LinkedIn’s Coral library.&lt;/p&gt;

&lt;h4 id=&quot;coral-the-unifier-of-the-bee-and-the-bunny&quot;&gt;Coral: the unifier of the bee and the bunny&lt;/h4&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/linkedin/coral&quot;&gt;Coral&lt;/a&gt; is a project that allows for 
translation between views from different SQL syntaxes. It can process HiveQL 
statements and convert them to an internal representation using
&lt;a href=&quot;https://calcite.apache.org/&quot;&gt;Apache Calcite&lt;/a&gt;. It then converts the internal
representation to Trino SQL.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;100%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/18/coral.png&quot; /&gt;
&lt;/p&gt;

&lt;h4 id=&quot;trino-reading-hive-view-sequence-diagrams&quot;&gt;Trino reading Hive view sequence diagrams&lt;/h4&gt;

&lt;p&gt;In both of these sequence diagrams, notice that the first actions are to create
a Hive view. The view is created and maintained by the Hive system, and it cannot 
be created or updated from Trino.&lt;/p&gt;

&lt;p&gt;This diagram shows the creation of a Hive view, then shows the sequence of events 
when Trino reads that view.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/18/hive-view-sequence.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;This diagram shows the creation of a Hive materialized view, then shows the 
sequence of events when Trino reads the materialized view.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/18/hive-materialized-view-sequence.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h4 id=&quot;trino-native-view-sequence-diagrams&quot;&gt;Trino native view sequence diagrams&lt;/h4&gt;

&lt;p&gt;This diagram shows the sequence diagram for a Trino view that is created using 
the Hive Connector.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/18/trino-view-hive-connector-sequence.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;This diagram shows the sequence diagram for a materialized Trino view that is 
created using the Iceberg Connector.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/18/trino-materialized-view-iceberg-connector-sequence.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h3 id=&quot;iceberg-materialized-view-refresh-currently-only-full-refresh-in-iceberg-connector&quot;&gt;Iceberg materialized view refresh (currently only full refresh in Iceberg connector)&lt;/h3&gt;

&lt;p&gt;Ideally, as the tables underlying a materialized view change, the materialized
view should be automatically and incrementally updated to keep its results 
in sync with the latest data.&lt;/p&gt;

&lt;p&gt;Automatically keeping materialized views fresh can be tricky from a resource 
management point of view, since the computation to materialize the materialized 
view can be expensive. Trino currently does not support automatic refresh of 
materialized views. It instead supports the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;REFRESH MATERIALIZED VIEW&lt;/code&gt; command 
that the user can issue to ensure that the materialized view is fresh.&lt;/p&gt;

&lt;p&gt;As a part of executing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;REFRESH MATERIALIZED VIEW&lt;/code&gt; command in Trino, existing
data in the materialized view is dropped and new data is inserted if there are 
any changes to base data. If the base data has not changed at all, the 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;REFRESH MATERIALIZED VIEW&lt;/code&gt; command is a no-op.&lt;/p&gt;

&lt;p&gt;What happens if the user issues a query against the materialized view, and the 
materialized view is not fresh? Trino detects that the materialized view is 
stale, so it expands the materialized view definition, much like a logical view 
and executes that SQL statement. Trino runs the query against the base tables.&lt;/p&gt;

&lt;p&gt;Incremental or delta refresh of materialized views is a more efficient way of
keeping the materialized view in sync with the base data. An incremental refresh 
means that only the parts of the data that need to be updated in a materialized 
view are updated. The rest of the data is left untouched. For example, say you 
have a base table, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sales&lt;/code&gt;, partitioned on the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;date&lt;/code&gt; column, that only 
receives inserts for the current day. If the materialized view is also 
partitioned on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;date&lt;/code&gt;, a new partition for a day can be added and populated, 
while data for previous days and months is still fresh and can be left 
untouched. This is something on Netflix’s roadmap: one option is a 
partition-level refresh, and another is a more granular row-level refresh using 
functionality similar to the SQL &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MERGE&lt;/code&gt; statement.&lt;/p&gt;
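Trino does not implement any of this today, but the partition-level idea can be sketched in plain SQL. Everything here is hypothetical and follows the example above: the catalog name, the `sales` base table, the `sales_mv` storage table backing the materialized view, and the `date` partitioning are all assumptions.

```sql
-- Hypothetical partition-level refresh: rebuild only today's partition
-- of the storage table that backs the materialized view.
DELETE FROM iceberg.mart.sales_mv
WHERE date = DATE '2021-05-16';

INSERT INTO iceberg.mart.sales_mv
SELECT date, mktsegment, sum(totalprice) AS total_sales
FROM iceberg.mart.sales
WHERE date = DATE '2021-05-16'
GROUP BY date, mktsegment;
```

A row-level refresh would instead replace the DELETE/INSERT pair with a single MERGE statement keyed on the grouping columns.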

&lt;h3 id=&quot;support-in-trino-and-at-netflix&quot;&gt;Support in Trino and at Netflix:&lt;/h3&gt;

&lt;h4 id=&quot;netflix-materialized-views&quot;&gt;Netflix materialized views&lt;/h4&gt;

&lt;p&gt;The main reason Netflix is interested in materialized views is to give analysts 
an easy way to compute and materialize their frequently used queries and keep 
the results refreshed without relying on an ETL pipeline to create and maintain 
those result sets. Some materialized views are as simple as queries that project
columns and apply filters, selecting data for a time range or for a test-id. 
Others are more complex, performing multi-level joins and aggregations.&lt;/p&gt;

&lt;h4 id=&quot;netflix-materialized-view-cross-compatibility-extension&quot;&gt;Netflix materialized view cross compatibility extension&lt;/h4&gt;

&lt;p&gt;Materialized views, much like logical views, are compatible across Trino and 
Spark, the two main engines used at Netflix. Spark is used at Netflix for ETL, 
including creating and populating tables. Trino is the most popular engine with 
analysts and developers for ad hoc and experimental queries as well as audits.&lt;/p&gt;

&lt;p&gt;Trino is also used for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CREATE TABLE AS SELECT&lt;/code&gt; (CTAS) in some use cases. Both 
engines access data from tables using the Iceberg and Hive connectors, where the
data is stored in S3. Netflix built upon the Trino logical views to create 
common views that are accessible from both Spark and Trino. The difference 
between the Trino logical views and Netflix common views is where the metadata 
lives: Trino logical views store it in the Hive metastore, while common views 
store their metadata in JSON format in S3.&lt;/p&gt;

&lt;p&gt;A view object in the Hive metastore points to the S3 location of the metadata. 
It tracks the evolution of the view definition as versions, so that you can 
potentially revert a view to an older version. The main benefit of common views 
is interoperability between Spark and Trino: you can create, replace, query, and 
drop views from either engine, and support can be expanded to other engines. 
Netflix supports common views through both the Hive and Iceberg connectors.&lt;/p&gt;

&lt;p&gt;Currently, common views support SQL syntax common to both Spark and Trino. This 
support can be expanded in the future using LinkedIn’s Coral project, such that 
engine-specific syntax and semantics can be translated and interpreted by 
another engine. Netflix materialized views are an extension of Trino 
materialized views that makes them interoperable between Spark and Trino. The 
only difference between Trino and Netflix materialized views is where the 
metadata is stored, very similar to Trino and Netflix logical views.&lt;/p&gt;

&lt;h3 id=&quot;roadmap&quot;&gt;Roadmap:&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Netflix is looking into caching query results using materialized views and 
  memory connector.&lt;/li&gt;
  &lt;li&gt;Incremental refresh ideas.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-4832-add-iceberg-support-for-materialized-views&quot;&gt;PR of the week: PR 4832 Add Iceberg support for materialized views&lt;/h2&gt;

&lt;p&gt;Our guest, Anjali, is the author of this week’s 
&lt;a href=&quot;https://github.com/trinodb/trino/pull/4832&quot;&gt;PR of the week&lt;/a&gt;, which adds Iceberg
support for materialized views. Thanks, Anjali!&lt;/p&gt;

&lt;p&gt;Honorable PR mentions:&lt;/p&gt;

&lt;p&gt;In order for the PR of the week to work, Anjali 
&lt;a href=&quot;https://github.com/trinodb/trino/pull/3283&quot;&gt;added syntax support&lt;/a&gt; for Trino 
materialized views with commands: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CREATE MATERIALIZED VIEW&lt;/code&gt;, 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;REFRESH MATERIALIZED VIEW&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DROP MATERIALIZED VIEW&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Before any of this was done, user &lt;a href=&quot;https://github.com/laurachenyu&quot;&gt;laurachenyu&lt;/a&gt; 
&lt;a href=&quot;https://github.com/trinodb/trino/pull/4661&quot;&gt;integrated Coral with Trino to enable querying Hive views&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;demo-showing-the-different-views-in-trino&quot;&gt;Demo: Showing the different views in Trino&lt;/h2&gt;

&lt;p&gt;In Trino, create some Hive tables in a Hive catalog named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hdfs&lt;/code&gt; that represents
the underlying storage Trino writes to.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE SCHEMA hdfs.tiny
WITH (location = &apos;/tiny/&apos;);

CREATE TABLE hdfs.tiny.customer
WITH (
  format = &apos;ORC&apos;,
  external_location = &apos;/tiny/customer/&apos;
) 
AS SELECT * FROM tpch.tiny.customer;

CREATE TABLE hdfs.tiny.orders
WITH (
  format = &apos;ORC&apos;,
  external_location = &apos;/tiny/orders/&apos;
) 
AS SELECT * FROM tpch.tiny.orders;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, create a logical Hive view (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hive_view&lt;/code&gt;), and a materialized Hive view
(&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hive_materialized_view&lt;/code&gt;) from the Hive CLI.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;USE tiny;

CREATE VIEW hive_view AS 
SELECT c.custkey, c.name, nationkey, mktsegment, orderstatus, totalprice, orderpriority, orderdate 
FROM customer c JOIN orders o ON c.custkey = o.custkey;

CREATE MATERIALIZED VIEW hive_materialized_view AS
SELECT c.custkey, c.name, nationkey, mktsegment, orderstatus, totalprice, orderpriority, orderdate 
FROM customer c JOIN orders o ON c.custkey = o.custkey;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;As you create the views, you can check their state in the Hive metastore.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT t.TBL_NAME, t.TBL_TYPE, t.VIEW_EXPANDED_TEXT, t.VIEW_ORIGINAL_TEXT 
FROM DBS d
 JOIN TBLS t ON d.DB_ID = t.DB_ID
WHERE d.NAME = &apos;tiny&apos;;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once the Hive views exist, create the Trino views and query everything from Trino. Note that two of the statements below fail, as explained in the comments.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;CREATE VIEW hdfs.tiny.trino_view AS 
SELECT c.custkey, c.name, nationkey, mktsegment, orderstatus, totalprice, orderpriority, orderdate 
FROM hdfs.tiny.customer c JOIN hdfs.tiny.orders o ON c.custkey = o.custkey;

/* Fails: Caused by: java.sql.SQLException: Query failed (#20210516_032433_00002_6syuw): 
This connector does not support creating materialized views */
CREATE MATERIALIZED VIEW hdfs.tiny.trino_materialized_view AS 
SELECT c.custkey, c.name, nationkey, mktsegment, orderstatus, totalprice, orderpriority, orderdate 
FROM hdfs.tiny.customer c JOIN hdfs.tiny.orders o ON c.custkey = o.custkey;

/* Fails: Caused by: java.sql.SQLException: Query failed (#20210516_101856_00009_ihjur): 
This connector does not support creating views */
CREATE VIEW iceberg.tiny.iceberg_view AS 
SELECT c.custkey, c.name, nationkey, mktsegment, orderstatus, totalprice, orderpriority, orderdate 
FROM hdfs.tiny.customer c JOIN hdfs.tiny.orders o ON c.custkey = o.custkey;

CREATE MATERIALIZED VIEW iceberg.tiny.iceberg_materialized_view AS 
SELECT c.custkey, c.name, nationkey, mktsegment, orderstatus, totalprice, orderpriority, orderdate 
FROM hdfs.tiny.customer c JOIN hdfs.tiny.orders o ON c.custkey = o.custkey;

/* 
This REFRESH call failed during the show due to the fact that I created the 
materialized Trino view in the Iceberg (`iceberg`) catalog using tables from the
Hive(`hdfs`) catalog. I should have created the materialized view using the
iceberg catalog:

CREATE MATERIALIZED VIEW iceberg.tiny.iceberg_materialized_view AS 
SELECT c.custkey, c.name, nationkey, mktsegment, orderstatus, totalprice, orderpriority, orderdate 
FROM iceberg.tiny.customer c JOIN iceberg.tiny.orders o ON c.custkey = o.custkey;
*/
REFRESH MATERIALIZED VIEW iceberg.tiny.iceberg_materialized_view;

/* query tables */

SELECT * FROM hdfs.tiny.customer LIMIT 3;

SELECT * FROM hdfs.tiny.orders LIMIT 3;

/* query views */

SELECT * FROM hdfs.tiny.trino_view LIMIT 3;

SELECT * FROM hdfs.tiny.hive_view LIMIT 3;

SELECT * FROM hdfs.tiny.hive_materialized_view LIMIT 3;

SELECT * FROM iceberg.tiny.iceberg_materialized_view LIMIT 3;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;question-of-the-week-are-jdbc-drivers-backwards-compatible-with-older-trino-versions&quot;&gt;Question of the week: Are JDBC drivers backwards compatible with older Trino versions?&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Full question:&lt;/strong&gt; Are JDBC drivers backwards compatible with older Trino 
versions? I’m trying to install the 354 driver on a multi-tenanted Tableau 
server where there might be older Trino versions in play. Do I need to upgrade 
my Trino clients right away when upgrading my server to Trino version from 
&amp;lt;=350 to &amp;gt;350?&lt;/p&gt;

&lt;p&gt;For this particular user’s case, the answer is that they won’t need to upgrade 
their clients, assuming their servers are already running Trino. If their server 
versions are PrestoSQL &amp;lt;= 350, then they will need to hold off on upgrading to a 
Trino client.&lt;/p&gt;

&lt;p&gt;Trino’s JDBC drivers typically maintain compatibility with older server versions
(and vice versa). However, the project was renamed from PrestoSQL to Trino 
starting version 351, and as a consequence, JDBC drivers with version &amp;gt;= 351 are
not compatible with servers with version &amp;lt;= 350. More details at:
&lt;a href=&quot;https://trino.io/blog/2021/01/04/migrating-from-prestosql-to-trino.html&quot;&gt;https://trino.io/blog/2021/01/04/migrating-from-prestosql-to-trino.html&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In short, you can have a PrestoSQL client with a Trino server, but you can’t 
have a Trino client with a PrestoSQL server.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Join us for an awesome event on May 26th as Iceberg creator Ryan Blue dives 
 into some interesting and less conventional use cases of Apache Iceberg:
 &lt;a href=&quot;https://www.meetup.com/trino-americas/events/278103777/&quot;&gt;Trino Americas meetup&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://engineering.linkedin.com/blog/2020/coral&quot;&gt;https://engineering.linkedin.com/blog/2020/coral&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Videos&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.arcadiadata.com/lp/tech-talk-on-join-optimization/&quot;&gt;https://www.arcadiadata.com/lp/tech-talk-on-join-optimization/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/15/training-advanced-sql.html&quot;&gt;Advanced SQL Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/30/training-query-tuning.html&quot;&gt;Query Tuning Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/13/training-security.html&quot;&gt;Security Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/27/training-performance.html&quot;&gt;Performance and Tuning Training&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Commander Bun Bun enjoying the views...</summary>

      
      
    </entry>
  
    <entry>
      <title>17: Trino connector resurfaces API calls</title>
      <link href="https://trino.io/episodes/17.html" rel="alternate" type="text/html" title="17: Trino connector resurfaces API calls" />
      <published>2021-05-13T00:00:00+00:00</published>
      <updated>2021-05-13T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/17</id>
      <content type="html" xml:base="https://trino.io/episodes/17.html">&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/17/trino-resurface.png&quot; /&gt;&lt;br /&gt;
Commander Bun Bun is diving deep to find anomalies!
&lt;/p&gt;

&lt;h2 id=&quot;resurface-links&quot;&gt;Resurface links&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://resurface.io/&quot;&gt;Resurface site&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/resurfaceio&quot;&gt;Resurface GitHub&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://resurface.io/slack&quot;&gt;Resurface Slack&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;Rob Dickinson, Co-founder and CEO of &lt;a href=&quot;https://resurface.io/&quot;&gt;Resurface&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/robfromboulder&quot;&gt;@robfromboulder&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Martin Traverso, creator of Trino/Presto, and CTO at 
 &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt; (&lt;a href=&quot;https://twitter.com/mtraverso&quot;&gt;@mtraverso&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-week-resurface-and-the-resurface-connector&quot;&gt;Concept of the week: Resurface and the Resurface connector&lt;/h2&gt;

&lt;h3 id=&quot;what-is-resurface&quot;&gt;What is Resurface?&lt;/h3&gt;
&lt;p&gt;Resurface is an API system of record, which is a fancy way of saying that 
Resurface is a purpose-built database for API requests and responses. Like a 
weblog or access log, but on steroids because Resurface runs on Trino.&lt;/p&gt;

&lt;p&gt;Why do you need a system of record for your APIs? Because otherwise you’re 
guessing about how your APIs are used and attacked, and guessing doesn’t feel 
good. Resurface helps your DevOps and security teams instantly find API 
failures, slowdowns, and attacks – easily, responsibly, and at scale.&lt;/p&gt;

&lt;h3 id=&quot;how-resurface-differs-from-logs--metrics&quot;&gt;How Resurface differs from logs &amp;amp; metrics&lt;/h3&gt;
&lt;p&gt;You probably use system monitoring tools, which tell you what’s happening on 
your systems: what code is running, what code is slow, and what error codes are 
returned. That’s all great, but it still leaves a big gap between the 
system-level events you can see and what your API consumers actually see.&lt;/p&gt;

&lt;p&gt;Resurface helps you fill this gap with your own API system of record. Now your 
customers, your DevOps team, and your security team all have the same view of 
every transaction, because there is a record of the requests and responses.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/17/resurface-tcb1.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;The other obvious way to compare Resurface against other tools is to look at the
data model. System monitoring gives you time-series metrics, or timestamped log
messages with a severity and detail string. Resurface gives you all the request
and response data fields, including headers and payloads, in a schema where all
of those fields are discrete and searchable. Plus it adds a bunch of helpful
virtual and computed columns.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/17/resurface-tcb2.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h3 id=&quot;the-indexing-problem&quot;&gt;The indexing problem&lt;/h3&gt;

&lt;p&gt;Resurface has a very descriptive data model, but there’s a problem here – how
to partition and index this data for efficient searching. Partitioning based on
time is the obvious starting point, but within a time range, what then? Index
everything?&lt;/p&gt;

&lt;p&gt;Most databases work best when a subset of the columns are constrained at once –
but in their case, they have strong reasons for wanting to use all columns at 
once. A system monitoring tool might give you a count of “500 codes” – but they
want to detect silent failures, like malformed JSON payloads or airline tickets 
selling for less than twenty dollars. That means looking at the URL, content 
type, other headers, and payloads, all at the same time.&lt;/p&gt;

&lt;p&gt;They also want to classify kinds of API consumers by their behaviors – are they
using or attacking your API? To classify those behaviors, they again look at the
URL, content type, and payloads. If they can query for the yellow region below,
they find lost revenue that they can recover.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/17/resurface-tcb4.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;Now you might be thinking – maybe the best solution is to do all this 
processing when the API calls are captured, but then how would you identify a 
new zero-day failure or attack? The definition of “responses failed” and 
“threats” needs to be changeable without having to reprocess any data, which 
really favors query-time processing.&lt;/p&gt;

&lt;p&gt;The example below is pretty much as simple as this gets. I struggled to find 
one of these queries that actually fits in a reasonable amount of space.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/17/resurface-tcb5.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;So how to build a database that does these kinds of queries in reasonable time?&lt;/p&gt;

&lt;h3 id=&quot;the-resurface-connector&quot;&gt;The Resurface connector&lt;/h3&gt;

&lt;p&gt;The first prototype actually used the Trino memory connector, which gave them 
the kind of query performance that they were looking for, but wasn’t shippable 
(for obvious reasons).&lt;/p&gt;

&lt;p&gt;Then they tried Redis as a replacement in-memory database, but the problem is 
that every query pulls all the data in Redis over the network. Not cool.&lt;/p&gt;

&lt;p&gt;Trino allows you to move the queries closer to the data, and so that’s what they
did. They took inspiration from the “local file” connector, where the connector
reads directly from the filesystem instead of over the network.&lt;/p&gt;

&lt;p&gt;Then the question was, what file format to use?  They tried JSON, CSV, Protocol
Buffers, and ultimately found the fastest and simplest approach was just to
write a simple binary file format that requires no real parsing. When these
files fit in memory, their connector can process SQL queries at 4GB/sec per core. 
The connector was easy to write because they’re just mapping between fields in 
the binary files and the columns exposed to Trino. They built the first version
of their connector in a weekend!&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/17/resurface-tcb3.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h3 id=&quot;why-not-just-use-avro&quot;&gt;Why not just use Avro?&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Simple requirements – basic versioning, no secondary objects, limited data 
 types&lt;/li&gt;
  &lt;li&gt;Zero-allocation reader for fast linear scan – one memcpy per physical column&lt;/li&gt;
  &lt;li&gt;Connector can report null/not-null without type conversion&lt;/li&gt;
  &lt;li&gt;Connector defers type conversion until getXXX() method&lt;/li&gt;
  &lt;li&gt;getSlice() just wraps an existing buffer (zero allocation)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most of these optimizations were realized by working backwards from the Trino 
connector API to get the best linear scan performance imaginable.&lt;/p&gt;

&lt;h3 id=&quot;combining-api-calls-with-other-data&quot;&gt;Combining API calls with other data&lt;/h3&gt;

&lt;p&gt;Now they can deliver API call data out to all the different kinds of SQL clients 
out there, and they’re also able to combine API call data with data stored in 
other databases.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/17/resurface-tcb6.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;This is really exciting because your Resurface database plays nicely with all 
your other databases that are bridged together with Trino. That means that 
actual API traffic can be brought into your customer data mart, or combined 
with data from any other systems, in real time!&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-4022-add-soundex-function&quot;&gt;PR of the week: PR 4022 Add Soundex function&lt;/h2&gt;

&lt;p&gt;A big shoutout to &lt;a href=&quot;https://github.com/tooptoop4&quot;&gt;tooptoop4&lt;/a&gt; for contributing this week’s
&lt;a href=&quot;https://github.com/trinodb/trino/pull/4022&quot;&gt;PR of the week&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This PR adds the &lt;a href=&quot;https://en.wikipedia.org/wiki/Soundex&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;soundex()&lt;/code&gt; function&lt;/a&gt;, 
a phonetic function that encodes words by how they sound. Such functions show up 
in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WHERE&lt;/code&gt; clause of a
query to find words that sound similar. There are a few examples in the demo
below.&lt;/p&gt;

&lt;p&gt;Thanks for this awesome contribution!&lt;/p&gt;
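For intuition about how those matches work, here is a minimal Python sketch of the classic American Soundex algorithm. This is an independent reimplementation for illustration only; Trino's built-in `soundex()` may differ in edge cases.

```python
def soundex(name):
    """Classic American Soundex: first letter plus three digits."""
    codes = {}
    for group, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")]:
        for ch in group:
            codes[ch] = digit
    # Keep only letters; the demo output suggests spaces are ignored.
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = codes.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":            # h and w do not separate duplicate codes
            continue
        code = codes.get(c, "")  # vowels map to "" and reset prev
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]  # pad with zeros, truncate to 4 chars

print(soundex("Christine"), soundex("Chris"), soundex("Kristine"))
# prints: C623 C620 K623
```

These codes line up with the demo results below: `Christine`, `Christ`, and `Christian` all encode to `C623`, while `Chris` stops short at `C620` and `Kristine` keeps its distinct first letter.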

&lt;h2 id=&quot;demo-using-the-soundex-function&quot;&gt;Demo: Using the soundex function&lt;/h2&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
SELECT * 
FROM (
  VALUES 
  (1, &apos;Bri&apos;), 
  (2, &apos;Bree&apos;), 
  (3, &apos;Bryan&apos;), 
  (4, &apos;Brian&apos;), 
  (5, &apos;Briann&apos;), 
  (6, &apos;Brianna&apos;), 
  (7, &apos;Briannas&apos;),
  (8, &apos;Bri Jan&apos;),  
  (9, &apos;Bri Yan&apos;),  
  (10, &apos;Bob&apos;)
) names(id, name)
WHERE soundex(name) = soundex(&apos;Brian&apos;);

# Results:
# |id |name   |
# |---|-------|
# |3  |Bryan  |
# |4  |Brian  |
# |5  |Briann |
# |6  |Brianna|
# |9  |Bri Yan|

SELECT * 
FROM (
  VALUES 
  (1, &apos;Man&apos;), 
  (2, &apos;Fred&apos;), 
  (3, &apos;Manfred&apos;), 
  (4, &apos;Can fed&apos;), 
  (5, &apos;Tan bed&apos;), 
  (6, &apos;Man Fred&apos;), 
  (7, &apos;Man dread&apos;), 
  (8, &apos;Bob&apos;)
) names(id, name)
WHERE soundex(name) = soundex(&apos;Manfred&apos;);

# Results:
# |id |name    |
# |---|--------|
# |3  |Manfred |
# |6  |Man Fred|

SELECT * 
FROM (
  VALUES 
  (1, &apos;Martin&apos;), 
  (2, &apos;Mar teen&apos;), 
  (3, &apos;Mar tin&apos;), 
  (4, &apos;Marteen&apos;), 
  (5, &apos;Mart in&apos;)
) names(id, name)
WHERE soundex(name) = soundex(&apos;Martin&apos;);

# Results:
# |id |name    |
# |---|--------|
# |1  |Martin  |
# |2  |Mar teen|
# |3  |Mar tin |
# |4  |Marteen |
# |5  |Mart in |

SELECT * 
FROM (
  VALUES 
  (1, &apos;Robert&apos;), 
  (2, &apos;Rob&apos;), 
  (3, &apos;Bob&apos;), 
  (4, &apos;Bobert&apos;), 
  (5, &apos;Bobby&apos;)
) names(id, name)
WHERE soundex(name) = soundex(&apos;Rob&apos;);

# Results:
# |id |name|
# |---|----|
# |2  |Rob |


SELECT * 
FROM (
  VALUES 
  (1, &apos;Christ&apos;), 
  (2, &apos;Christeen&apos;), 
  (3, &apos;Christian&apos;), 
  (4, &apos;Christine&apos;), 
  (5, &apos;Chris&apos;), 
  (6, &apos;Kristine&apos;)
) names(id, name)
WHERE soundex(name) = soundex(&apos;Christine&apos;);

# Results:
# |id |name     |
# |---|---------|
# |1  |Christ   |
# |2  |Christeen|
# |3  |Christian|
# |4  |Christine|

# What the results actually return

SELECT name, soundex(name)
FROM (
  VALUES 
  (1, &apos;Christ&apos;), 
  (2, &apos;Christeen&apos;), 
  (3, &apos;Christian&apos;), 
  (4, &apos;Christine&apos;), 
  (5, &apos;Chris&apos;), 
  (6, &apos;Kristine&apos;)
) names(id, name);

# Results:
# |name     |_col1|
# |---------|-----|
# |Christ   |C623 |
# |Christeen|C623 |
# |Christian|C623 |
# |Christine|C623 |
# |Chris    |C620 |
# |Kristine |K623 |

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;question-of-the-week-how-to-export-query-results-into-a-file-eg-ctas-but-into-a-single-file&quot;&gt;Question of the week: How to export query results into a file (e.g. CTAS, but into a single file)?&lt;/h2&gt;

&lt;p&gt;This is possible using the &lt;a href=&quot;https://trino.io/docs/current/client/cli.html&quot;&gt;Trino CLI&lt;/a&gt;’s
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--execute&lt;/code&gt; option in conjunction with the redirect operator (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;gt;&lt;/code&gt;). You may also
use other options, such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--output-format&lt;/code&gt;, to specify the format of the data
written to the file (e.g. CSV, TSV, or JSON, with or without headers).&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Output format for batch mode [ALIGNED, VERTICAL, TSV, TSV_HEADER, CSV, 
CSV_HEADER, CSV_UNQUOTED, CSV_HEADER_UNQUOTED, JSON, NULL] (default: CSV)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Here is an example of the command you would run using the cli executable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trino&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;trino --execute &quot;select * from tpch.sf1.customer limit 5&quot; \
--server http://localhost:8080 \
--output-format CSV_HEADER &amp;gt; customer.csv
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If you’re running Trino in Docker, here is an example command to run this in a
temporary Trino container.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker run --rm -ti \
    --network=trino-hdfs3_trino-network \
    --name export-trino-data \
    trinodb/trino:latest \
    trino --execute &quot;select * from tpch.sf1.customer limit 5&quot; \
    --server http://trino-coordinator:8080 \
    --output-format CSV_HEADER &amp;gt; customer.csv
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If you have a very complex query that takes up multiple lines, or you don’t 
want to spend half of your day escaping quotation marks, you can put your SQL 
into a file and reference it using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-f&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--file&lt;/code&gt; option. The command 
above could then be written as:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;trino --file query.sql \
--server http://localhost:8080 \
--output-format CSV_HEADER &amp;gt; customer.csv
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This command, along with the following &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query.sql&lt;/code&gt; file, produces an equivalent result:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;select * 
from tpch.sf1.customer 
limit 5;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Finally, one last trick is to stage the data using the memory connector and 
then export it. The &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;Trino Definitive Guide&lt;/a&gt; 
has an example of adding the Iris data set into memory connector storage with 
the CLI.&lt;/p&gt;
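As a rough sketch of that trick, assuming a catalog named `memory` backed by the memory connector (the catalog and table names here are assumptions, not from the episode):

```sql
-- Stage the result set once in the memory connector so it can be
-- exported repeatedly without re-running the original query.
CREATE TABLE memory.default.customer_stage AS
SELECT * FROM tpch.sf1.customer LIMIT 5;
```

You can then export the staged table with the `--execute` flag and a redirect, exactly as in the earlier examples, possibly several times or in different output formats.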

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Apache Iceberg: A table format for data lakes with unforeseen use cases
    &lt;ul&gt;
      &lt;li&gt;Americas meetup&lt;/li&gt;
      &lt;li&gt;May 26th, 2021 @ 5:30p EDT&lt;/li&gt;
      &lt;li&gt;Link: &lt;a href=&quot;https://www.meetup.com/trino-americas/events/278103777/&quot;&gt;https://www.meetup.com/trino-americas/events/278103777/&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Trino Summit
    &lt;ul&gt;
      &lt;li&gt;Hybrid event&lt;/li&gt;
      &lt;li&gt;September 15th, 2021&lt;/li&gt;
      &lt;li&gt;Link: &lt;a href=&quot;http://starburst.io/trinosummit2021&quot;&gt;http://starburst.io/trinosummit2021&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://resurface.io/blog/why-we-love-trino&quot;&gt;https://resurface.io/blog/why-we-love-trino&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://resurface.io/blog/what-is-api-observability&quot;&gt;https://resurface.io/blog/what-is-api-observability&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://resurface.io/blog/forking-open-source&quot;&gt;https://resurface.io/blog/forking-open-source&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/15/training-advanced-sql.html&quot;&gt;Advanced SQL Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/30/training-query-tuning.html&quot;&gt;Query Tuning Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/13/training-security.html&quot;&gt;Security Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/27/training-performance.html&quot;&gt;Performance and Tuning Training&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Commander Bun Bun is diving deep to find anomalies!</summary>

      
      
    </entry>
  
    <entry>
      <title>16: Make data fluid with Apache Druid</title>
      <link href="https://trino.io/episodes/16.html" rel="alternate" type="text/html" title="16: Make data fluid with Apache Druid" />
      <published>2021-04-29T00:00:00+00:00</published>
      <updated>2021-04-29T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/16</id>
      <content type="html" xml:base="https://trino.io/episodes/16.html">&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/16/trino-druid.png&quot; /&gt;&lt;br /&gt;
Commander Bun Bun the speedy druid!
&lt;/p&gt;

&lt;h2 id=&quot;druid-links&quot;&gt;Druid links&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://druid.apache.org/&quot;&gt;Apache Druid&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://druid.apache.org/community/&quot;&gt;Apache Druid Community&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.druidforum.org/&quot;&gt;Druid Forum&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;Samarth Jain, Software Engineer at Netflix 
 (&lt;a href=&quot;https://www.linkedin.com/in/samarthjain11/&quot;&gt;@samarthjain11&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Parth Brahmbhatt, Senior Software Engineer at Netflix 
 (&lt;a href=&quot;https://twitter.com/brahmbhattparth/&quot;&gt;@brahmbhattparth&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Rachel Pedreschi, VP Community and Developer Relations at 
 &lt;a href=&quot;https://imply.io/&quot;&gt;Imply&lt;/a&gt; (&lt;a href=&quot;https://twitter.com/rachelpedreschi&quot;&gt;@rachelpedreschi&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;release-356&quot;&gt;Release 356&lt;/h2&gt;

&lt;p&gt;Release notes discussed: &lt;a href=&quot;https://trino.io/docs/current/release/release-356.html&quot;&gt;https://trino.io/docs/current/release/release-356.html&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;General:
    &lt;ul&gt;
      &lt;li&gt;MATCH_RECOGNIZE clause support, used to detect patterns in a set of rows 
within a single query&lt;/li&gt;
      &lt;li&gt;soundex function&lt;/li&gt;
      &lt;li&gt;Property to limit planning time (and improved behavior about cancel during 
planning)&lt;/li&gt;
      &lt;li&gt;A bunch of performance improvements around pushdown (and start of docs for 
pushdowns)&lt;/li&gt;
      &lt;li&gt;Misc improvements around materialized views support&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;JDBC driver - OAuth2 token caching in memory&lt;/li&gt;
  &lt;li&gt;BigQuery - create and drop schema&lt;/li&gt;
  &lt;li&gt;Hive - Parquet, ORC and Azure ADL improvements&lt;/li&gt;
  &lt;li&gt;Iceberg - SHOW TABLES even when tables created elsewhere&lt;/li&gt;
  &lt;li&gt;Kafka - SSL support&lt;/li&gt;
  &lt;li&gt;Metadata caching improvements for a bunch of connectors&lt;/li&gt;
  &lt;li&gt;SPI: a couple of changes&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-week-apache-druid-and-realtime-analytics&quot;&gt;Concept of the week: Apache Druid and realtime analytics&lt;/h2&gt;

&lt;p&gt;This week covers Apache Druid, a modern, real-time OLAP database. Joining us is 
the head of developer relations at Imply, the company that creates an enterprise
 version of Druid, to cover what Druid is and the use cases it solves.&lt;/p&gt;

&lt;p&gt;Here are the slides that Rachel uses in the show:&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;iframe src=&quot;//www.slideshare.net/slideshow/embed_code/key/1fKHCGSRJwUjB7&quot; width=&quot;800&quot; height=&quot;650&quot; frameborder=&quot;0&quot; marginwidth=&quot;0&quot; marginheight=&quot;0&quot; scrolling=&quot;no&quot; style=&quot;border:1px solid #CCC; border-width:1px; 
margin-bottom:5px; max-width: 100%;&quot; allowfullscreen=&quot;&quot;&gt; 
&lt;/iframe&gt;
&lt;/p&gt;

&lt;h3 id=&quot;druid-architecture&quot;&gt;Druid Architecture&lt;/h3&gt;

&lt;p&gt;Druid has several process types:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Coordinator&lt;/strong&gt; processes manage data availability on the cluster.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Overlord&lt;/strong&gt; processes control the assignment of data ingestion workloads.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Broker&lt;/strong&gt; processes handle queries from external clients.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Router&lt;/strong&gt; processes are optional processes that can route requests to Brokers, Coordinators, and Overlords.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Historical&lt;/strong&gt; processes store queryable data.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;MiddleManager&lt;/strong&gt; processes are responsible for ingesting data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/16/druid-architecture.png&quot; /&gt;&lt;br /&gt;
The Druid architecture.
&lt;/p&gt;

&lt;p&gt;Druid processes can be deployed any way you like, but for ease of deployment we 
suggest organizing them into three server types: Master, Query, and Data.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Master: Runs Coordinator and Overlord processes, manages data availability and ingestion.&lt;/li&gt;
  &lt;li&gt;Query: Runs Broker and optional Router processes, handles queries from external clients.&lt;/li&gt;
  &lt;li&gt;Data: Runs Historical and MiddleManager processes, executes ingestion workloads and stores all queryable data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Source: &lt;a href=&quot;https://druid.apache.org/docs/latest/design/architecture.html&quot;&gt;https://druid.apache.org/docs/latest/design/architecture.html&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-3522-add-druid-connector&quot;&gt;PR of the week: PR 3522 Add Druid connector&lt;/h2&gt;

&lt;p&gt;Our guest, Samarth, is the author of this week’s 
&lt;a href=&quot;https://github.com/trinodb/trino/pull/3522&quot;&gt;PR of the week&lt;/a&gt;. 
&lt;a href=&quot;https://twitter.com/puneetjaiswal&quot;&gt;Puneet Jaiswal&lt;/a&gt; was the first engineer to
start work on adding a Druid connector. Later, Samarth picked up the torch, and 
the Trino Druid connector became available in 
&lt;a href=&quot;/docs/current/release/release-337.html&quot;&gt;release 337&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;An honorable mention goes to our other guest, Parth, for doing some 
&lt;a href=&quot;https://github.com/trinodb/trino/pull/3697&quot;&gt;preliminary work&lt;/a&gt; that enabled 
aggregation pushdown in the SPI. This enabled the Druid connector to scale
well once PR 4313 was completed (see future work below).&lt;/p&gt;

&lt;p&gt;A &lt;a href=&quot;https://github.com/trinodb/trino/pull/3881&quot;&gt;third honorable PR&lt;/a&gt;, 
that was completed by &lt;a href=&quot;https://twitter.com/findepi&quot;&gt;@findepi&lt;/a&gt;, was adding 
pushdown to the JDBC client, which appeared in release 337 along with the Druid 
connector.&lt;/p&gt;

&lt;p&gt;It is incredible to see the number of hands that various features and connectors
pass through to get to the final release.&lt;/p&gt;

&lt;h3 id=&quot;future-work&quot;&gt;Future work:&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/pull/4249&quot;&gt;SPI and optimizer rule for connectors that can support complete topN (PR 4249)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/pull/4313&quot;&gt;Implement aggregate pushdown for Druid (PR 4313)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/pull/4554&quot;&gt;Optimizer rule to support aggregate pushdown with grouping sets (PR 4554)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;demo-using-the-druid-web-ui-to-create-an-ingestion-spec-querying-via-trino&quot;&gt;Demo: Using the Druid web UI to create an ingestion spec and query via Trino&lt;/h2&gt;

&lt;p&gt;Let’s start up the Druid cluster along with the required Zookeeper and 
PostgreSQL instance. Clone this repository and navigate to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trino-druid&lt;/code&gt;
directory.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git@github.com:bitsondatadev/trino-getting-started.git

cd community_tutorials/druid/trino-druid

docker-compose up -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To run a batch ingestion, navigate to the Druid web UI at 
&lt;a href=&quot;http://localhost:8888&quot;&gt;http://localhost:8888&lt;/a&gt; once it has finished starting up. Click the “Load data” button, 
choose “Example data”, and follow the prompts to create the native batch 
ingestion spec. Once the spec is created, run the job to ingest the data.
More information can be found here: &lt;a href=&quot;https://druid.apache.org/docs/latest/tutorials/index.html&quot;&gt;https://druid.apache.org/docs/latest/tutorials/index.html&lt;/a&gt;&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/16/druid-console.png&quot; /&gt;&lt;br /&gt;
The Druid console.
&lt;/p&gt;

&lt;p&gt;Once Druid completes the task, open up a Trino connection and validate that the 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;druid&lt;/code&gt; catalog exists.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker exec -it trino-druid_trino-coordinator_1 trino

trino&amp;gt; SHOW CATALOGS;

 Catalog 
---------
 druid   
 system  
 tpcds   
 tpch    
(4 rows)

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now show the tables under the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;druid.druid&lt;/code&gt; schema.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;trino&amp;gt; SHOW TABLES IN druid.druid;
   Table   
-----------
 wikipedia 
(1 row)

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Run a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SHOW CREATE TABLE&lt;/code&gt; to see the column definitions.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;trino&amp;gt; SHOW CREATE TABLE druid.druid.wikipedia;
             Create Table             
--------------------------------------
 CREATE TABLE druid.druid.wikipedia ( 
    __time timestamp(3) NOT NULL,     
    added bigint NOT NULL,            
    channel varchar,                  
    cityname varchar,                 
    comment varchar,                  
    commentlength bigint NOT NULL,    
    countryisocode varchar,           
    countryname varchar,              
    deleted bigint NOT NULL,          
    delta bigint NOT NULL,            
    deltabucket bigint NOT NULL,      
    diffurl varchar,                  
    flags varchar,                    
    isanonymous varchar,              
    isminor varchar,                  
    isnew varchar,                    
    isrobot varchar,                  
    isunpatrolled varchar,            
    metrocode varchar,                
    namespace varchar,                
    page varchar,                     
    regionisocode varchar,            
    regionname varchar,               
    user varchar                      
 )                                    
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Finally, query the first 5 rows of data showing the user and how much they added.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;trino&amp;gt; SELECT user, added FROM druid.druid.wikipedia LIMIT 5;
      user       | added 
-----------------+-------
 Lsjbot          |    31 
 ワーナー成増    |   125 
 181.230.118.178 |     2 
 JasonAQuest     |     0 
 Kolega2357      |     0 
(5 rows)

&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;question-of-the-week-why-doesnt-the-druid-connector-use-the-native-json-over-http-calls&quot;&gt;Question of the week: Why doesn’t the Druid connector use the native JSON over HTTP calls?&lt;/h2&gt;

&lt;p&gt;To answer this question I’m going to quote Samarth and Parth on this from 
&lt;a href=&quot;https://trinodb.slack.com/archives/CHD6386E4/p1589311502029000?thread_ts=1586167749.002500&amp;amp;cid=CHD6386E4&quot;&gt;this super long but enlightening thread&lt;/a&gt;
on the subject.&lt;/p&gt;

&lt;h3 id=&quot;samarths-take&quot;&gt;Samarth’s take:&lt;/h3&gt;

&lt;p&gt;Pro JDBC:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Going forward, Druid SQL is going to be the de-facto way of accessing Druid 
 data, with native JSON queries being more of an advanced use case. A 
 benefit of going down the SQL route is that we can take advantage of all the changes 
 made in the Druid SQL optimizer land, like using vectorized query processing 
 when possible, when to use a TopN vs group by query type, etc. If we were to 
 hit historicals directly, which don’t support SQL querying, we potentially 
 won’t be taking advantage of such optimizations unless we keep
 porting/applying them to the trino-druid connector, which may not always be 
 possible.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;If we end up letting a Trino node act as a Druid broker (which is what 
 would happen I assume when you let a Trino node do the final merging), then, 
 you would need to allocate similar kinds of resources (direct memory buffers, 
 etc.) to all the Trino worker nodes as a Druid broker which may not be ideal.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;This is not necessarily a limitation but adds complexity - with your proposed 
 implementation, the Trino cluster will need to maintain state about what Druid
 segments are hosted on what data nodes (middle managers and historicals). The 
 Druid broker already maintains that state and having to replicate and store all
 that state on the Trino coordinator will demand more resources out of it.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;To your point on SCAN query overwhelming the broker - that shouldn’t be the 
 case, as the Druid scan query type streams results through the broker instead of 
 materializing all of them in memory. See: &lt;a href=&quot;https://druid.apache.org/docs/latest/querying/scan-query.html&quot;&gt;https://druid.apache.org/docs/latest/querying/scan-query.html&lt;/a&gt;&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Pro HTTP:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;One use case where directly hitting the historicals may help is when the 
 group by key space is large (like a group by on a UUID-like column). For a very 
 large data set, a Druid broker can get overwhelmed when performing the giant 
 merge. By hitting historicals directly, we can let historicals do first level 
 merge followed by multiple Trino workers doing the second level merge. I am 
 not sure if solving for this limited use case is worth going the http native
 query route, though. IMHO, Druid generally isn’t built for pulling lots of 
 data out of it. You can do it, but whether you want to push that work down to 
 the Druid cluster or let Trino directly pull it down for you is debatable.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I would advocate for going the Druid SQL route at least for the initial version 
of the connector. This would provide a solution for the majority of the use 
cases that Druid generally is used for (OLAP style queries over pre-aggregated 
data). We could, in the next version of the connector, possibly focus on adding a
new mode of the connector which can make native JSON queries directly to the 
Druid historicals and middle managers instead of submitting SQL queries to the 
broker.&lt;/p&gt;

&lt;h3 id=&quot;parths-take&quot;&gt;Parth’s take:&lt;/h3&gt;

&lt;p&gt;Our general take is that Druid is designed as an OLAP cube, so it is really fast
when it comes to aggregate queries over reasonable-cardinality dimensions, and 
will not work well for use cases that treat it like a regular data 
warehouse and try to do pure select scans with filters. The primary reasons 
most of our users would look to Trino’s Druid connector are:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;To be able to join already aggregated data in Druid to some other datastore 
 in our warehouse.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;To gain access through tooling that doesn’t have good support for Druid 
 inherently for dashboarding use cases (think Tableau).&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Even if we wanted to support the use cases that Druid is not designed for in a 
more efficient manner by going through historicals directly, it has other 
implications. We are now talking about partial aggregation pushdown which is 
more complicated IMO than our current approach of complete pushdown. We could 
choose to take the approach that others have taken where we can incrementally 
add a mode to Druid connector to either use JDBC or go directly to historical, 
but I really don’t think it’s a good idea to block the current development in 
hopes of a more efficient future version, especially when this is just an 
implementation detail that we can switch anytime without breaking any user 
queries.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Trino Summit:
&lt;a href=&quot;http://starburst.io/trinosummit2021&quot;&gt;http://starburst.io/trinosummit2021&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://netflixtechblog.com/how-netflix-uses-druid-for-real-time-insights-to-ensure-a-high-quality-experience-19e1e8568d06&quot;&gt;https://netflixtechblog.com/how-netflix-uses-druid-for-real-time-insights-to-ensure-a-high-quality-experience-19e1e8568d06&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://imply.io/post/apache-druid-joins&quot;&gt;https://imply.io/post/apache-druid-joins&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/gumgum-tech/optimized-real-time-analytics-using-spark-streaming-and-apache-druid-d872a86ed99d&quot;&gt;https://medium.com/gumgum-tech/optimized-real-time-analytics-using-spark-streaming-and-apache-druid-d872a86ed99d&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.inovex.de/blog/a-close-look-at-the-workings-of-apache-druid/&quot;&gt;https://www.inovex.de/blog/a-close-look-at-the-workings-of-apache-druid/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://leventov.medium.com/comparison-of-the-open-source-olap-systems-for-big-data-clickhouse-druid-and-pinot-8e042a5ed1c7&quot;&gt;https://leventov.medium.com/comparison-of-the-open-source-olap-systems-for-big-data-clickhouse-druid-and-pinot-8e042a5ed1c7&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Videos&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=mf8Hb0coI6o&quot;&gt;https://www.youtube.com/watch?v=mf8Hb0coI6o&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=Kxbzr7UP1dI&amp;amp;t=1274s&quot;&gt;https://www.youtube.com/watch?v=Kxbzr7UP1dI&amp;amp;t=1274s&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=QNmSXMQ-gY4&quot;&gt;https://www.youtube.com/watch?v=QNmSXMQ-gY4&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/15/training-advanced-sql.html&quot;&gt;Advanced SQL Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/30/training-query-tuning.html&quot;&gt;Query Tuning Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/13/training-security.html&quot;&gt;Security Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/27/training-performance.html&quot;&gt;Performance and Tuning Training&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Commander Bun Bun the speedy druid!</summary>

      
      
    </entry>
  
    <entry>
      <title>15: Iceberg right ahead!</title>
      <link href="https://trino.io/episodes/15.html" rel="alternate" type="text/html" title="15: Iceberg right ahead!" />
      <published>2021-04-15T00:00:00+00:00</published>
      <updated>2021-04-15T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/15</id>
      <content type="html" xml:base="https://trino.io/episodes/15.html">&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/15/trino-iceberg.png&quot; /&gt;&lt;br /&gt;
Looks like Commander Bun Bun is safe on this Iceberg&lt;br /&gt;
&lt;a href=&quot;https://joshdata.me/iceberger.html&quot;&gt;https://joshdata.me/iceberger.html&lt;/a&gt;
&lt;/p&gt;

&lt;h2 id=&quot;iceberg-links&quot;&gt;Iceberg links&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://iceberg.apache.org/&quot;&gt;Apache Iceberg&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://iceberg.apache.org/community/&quot;&gt;Apache Iceberg Community&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;Ryan Blue, creator of Iceberg, and Senior Software Engineer at 
 Netflix (&lt;a href=&quot;https://github.com/rdblue&quot;&gt;@rdblue&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;David Phillips, creator of Trino/Presto, and CTO at 
 &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt; (&lt;a href=&quot;https://twitter.com/electrum32&quot;&gt;@electrum32&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;release-355&quot;&gt;Release 355&lt;/h2&gt;

&lt;p&gt;Release notes discussed: &lt;a href=&quot;https://trino.io/docs/current/release/release-355.html&quot;&gt;https://trino.io/docs/current/release/release-355.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Martin’s list:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Multiple password authentication plugins&lt;/li&gt;
  &lt;li&gt;Column and table lineage reporting in query events&lt;/li&gt;
  &lt;li&gt;Improved planning performance for queries against Phoenix or SQL Server&lt;/li&gt;
  &lt;li&gt;Improved performance for ORDER BY … LIMIT queries against Phoenix&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred’s notes:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Security overview and TLS pages and authentication types&lt;/li&gt;
  &lt;li&gt;Reiterate multiple authentication providers (ldap1, ldap2, password)&lt;/li&gt;
  &lt;li&gt;Improved parallelism when the table bucket count is small compared to the number of nodes&lt;/li&gt;
  &lt;li&gt;Include information about spill to disk in EXPLAIN ANALYZE&lt;/li&gt;
  &lt;li&gt;Unixtime function changes&lt;/li&gt;
  &lt;li&gt;Hive view support improvements&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-week-apache-iceberg-and-the-iceberg-spec&quot;&gt;Concept of the week: Apache Iceberg and the Iceberg spec&lt;/h2&gt;

&lt;h3 id=&quot;interview-with-ryan-blue&quot;&gt;Interview with Ryan Blue&lt;/h3&gt;

&lt;p&gt;In &lt;a href=&quot;/episodes/14.html&quot;&gt;the previous episode&lt;/a&gt;, we covered the 
differences between the Iceberg table format, and the Hive table format from a 
technical standpoint in the context of Trino. We highly recommend watching it
before this episode. In this episode we ask Ryan about the origins of Apache 
Iceberg and why he started the project. We cover some details of the 
&lt;a href=&quot;https://iceberg.apache.org/spec/&quot;&gt;Iceberg specification&lt;/a&gt; which is a nice change
from the ad-hoc specification that people adhere to when using Hive tables. Then
Ryan dives into several amazing use cases showing how Netflix and others use Iceberg.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-7233-fix-queries-on-tables-without-snapshot-id&quot;&gt;PR of the week: PR 7233 Fix queries on tables without snapshot id&lt;/h2&gt;

&lt;p&gt;This week’s &lt;a href=&quot;https://github.com/trinodb/trino/pull/7233&quot;&gt;PR of the week&lt;/a&gt; was 
submitted by one of the Trino maintainers,
&lt;a href=&quot;https://twitter.com/desai_pratham&quot;&gt;Pratham Desai&lt;/a&gt;. Pratham is a Software 
Engineer at LinkedIn who commits a lot of time in the Trino community helping
out on the slack channel, contributing code, and doing PR reviews. Thank you for
all you do Pratham!&lt;/p&gt;

&lt;p&gt;Had Brian known about this PR, he wouldn’t have had the issue he did with 
reading the empty snapshot created with the Iceberg Java API and would have been 
able to read and insert into the table just fine. If you come across this issue,
we introduced this feature in 
&lt;a href=&quot;/docs/current/release/release-344.html&quot;&gt;release 344&lt;/a&gt;!&lt;/p&gt;

&lt;h3 id=&quot;another-future-development-for-the-trino-iceberg-connector&quot;&gt;Another future development for the Trino Iceberg connector&lt;/h3&gt;

&lt;p&gt;Along with the future developments we discussed in the previous episode, another
core Iceberg functionality that we want to add in Trino is support for
&lt;a href=&quot;https://github.com/trinodb/trino/issues/7580&quot;&gt;partition migration&lt;/a&gt;. We also 
discussed future support for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UPDATE&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MERGE&lt;/code&gt; capabilities for the Iceberg 
connector.&lt;/p&gt;
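
&lt;p&gt;At the time of this episode, neither statement is supported by the Iceberg
connector yet. Since both follow standard SQL, once support lands the usage
might look something like this (a hypothetical sketch; the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;new_logs&lt;/code&gt; source
table is made up for illustration):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-- hypothetical: downgrade a noisy error once UPDATE is supported
UPDATE iceberg.logging.logs
SET level = &apos;WARN&apos;
WHERE message = &apos;Maybeh oh noes?&apos;;

-- hypothetical: upsert new log events once MERGE is supported
MERGE INTO iceberg.logging.logs t
USING iceberg.logging.new_logs s
  ON t.event_time = s.event_time AND t.message = s.message
WHEN MATCHED THEN
  UPDATE SET level = s.level
WHEN NOT MATCHED THEN
  INSERT VALUES (s.level, s.event_time, s.message, s.call_stack);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;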

&lt;h2 id=&quot;demo-creating-tables-with-iceberg-and-reading-the-data-in-trino&quot;&gt;Demo: Creating tables with Iceberg and reading the data in Trino&lt;/h2&gt;

&lt;p&gt;For this week’s demo, we continue to use the Iceberg Java API to create a table.
You also have the option to use Trino, Spark, or other engines to ingest and query
the data, but I wanted to use the vanilla Iceberg APIs to experience them and
hopefully solidify my learning of Iceberg concepts in the process. Make sure you
follow the instructions in the repository if you don’t have Docker or Java
installed.&lt;/p&gt;

&lt;p&gt;Let’s start up a local Trino coordinator and Hive metastore. Clone this 
repository and navigate to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;iceberg/trino-iceberg-minio&lt;/code&gt; directory. Then
start up the containers using Docker Compose.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git@github.com:bitsondatadev/trino-getting-started.git

cd iceberg/trino-iceberg-minio

docker-compose up -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In your favorite IDE, open the files under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;iceberg/iceberg-java&lt;/code&gt; in your
project and run the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IcebergMain&lt;/code&gt; class.&lt;/p&gt;

&lt;p&gt;This class creates the logging schema, and a logging table within it if the 
table doesn’t already exist. Once you run this code, you can check that the 
table exists in the metastore under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TABLE_PARAMS&lt;/code&gt;.&lt;/p&gt;
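
&lt;p&gt;As a hypothetical sketch of that check (run against the metastore’s backing
database, not Trino; the table and key names follow the standard Hive metastore
schema):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-- query the Hive metastore&apos;s backing database directly
SELECT PARAM_KEY, PARAM_VALUE
FROM TABLE_PARAMS
WHERE PARAM_KEY IN (&apos;table_type&apos;, &apos;metadata_location&apos;);

-- an Iceberg table shows table_type = &apos;ICEBERG&apos; and a metadata_location
-- pointing at the current metadata JSON file
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;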

&lt;p&gt;Now we transition from the Java API to running queries over Iceberg using Trino.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;/**
 * This is the equivalent of running IcebergMain in the iceberg-java project.
 * Go ahead and inspect the java code you can use to interact with Iceberg
 * tables and metadata.
 */
CREATE TABLE iceberg.logging.logs (
   level varchar NOT NULL,
   event_time timestamp(6) with time zone NOT NULL,
   message varchar NOT NULL,
   call_stack array(varchar)
)
WITH (
   format = &apos;ORC&apos;,
   partitioning = ARRAY[&apos;hour(event_time)&apos;,&apos;level&apos;]
)

/**
 * Read From Trino
 */

SELECT * FROM iceberg.logging.logs;

/**
 * Write data from Trino and check data and snapshots
 */

INSERT INTO iceberg.logging.logs VALUES 
(
  &apos;ERROR&apos;, 
  timestamp &apos;2021-04-01&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;, 
  &apos;Oh noes&apos;,
  ARRAY [&apos;Exception in thread &quot;main&quot; java.lang.NullPointerException&apos;]
);

SELECT * FROM iceberg.logging.logs;

SELECT * FROM iceberg.logging.&quot;logs$snapshots&quot;;

/**
 * Write more data from Trino and check data and snapshots
 */
INSERT INTO iceberg.logging.logs 
VALUES 
(
  &apos;ERROR&apos;, 
  timestamp &apos;2021-04-01&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;, 
  &apos;Oh noes&apos;, 
  ARRAY [&apos;Exception in thread &quot;main&quot; java.lang.NullPointerException&apos;]
), 
(
  &apos;ERROR&apos;, 
  timestamp &apos;2021-04-01 15:55:23.383345&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;, 
  &apos;Double oh noes&apos;, 
  ARRAY [&apos;Exception in thread &quot;main&quot; java.lang.NullPointerException&apos;]
), 
(
  &apos;WARN&apos;, 
  timestamp &apos;2021-04-01 15:55:23.383345&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;, 
  &apos;Maybeh oh noes?&apos;, 
  ARRAY [&apos;bad things could be happening&apos;]
);

 
SELECT * FROM iceberg.logging.logs;

SELECT * FROM iceberg.logging.&quot;logs$snapshots&quot;;

/**
 * Read data from an old snapshot (Time travel)
 */

SELECT * FROM iceberg.logging.&quot;logs@2806470637437034115&quot;;

/**
 * Add new column, notice there is no snapshots of the metadata
 */

ALTER TABLE iceberg.logging.logs ADD COLUMN severity INTEGER;

SHOW CREATE TABLE iceberg.logging.logs;

SELECT * FROM iceberg.logging.logs;

SELECT * FROM iceberg.logging.&quot;logs$snapshots&quot;;

/**
 * Insert new data with new column
 */

INSERT INTO iceberg.logging.logs VALUES 
(
  &apos;INFO&apos;, 
  timestamp &apos;2021-04-01 19:59:59.999999&apos; AT TIME ZONE &apos;America/Los_Angeles&apos;, 
  &apos;es muy bueno&apos;, 
  ARRAY [&apos;It is all normal&apos;], 
  1
);

SELECT * FROM iceberg.logging.logs;

SELECT * FROM iceberg.logging.&quot;logs$snapshots&quot;;

/**
 * Rename column and drop column
 */

ALTER TABLE iceberg.logging.logs RENAME COLUMN severity TO priority;

SHOW CREATE TABLE iceberg.logging.logs;

SELECT * FROM iceberg.logging.logs;

ALTER TABLE iceberg.logging.logs DROP COLUMN priority;

SHOW CREATE TABLE iceberg.logging.logs;

SELECT * FROM iceberg.logging.logs;

SELECT * FROM iceberg.logging.&quot;logs$snapshots&quot;;

/**
 * Travel back to previous snapshots
 */

SELECT * FROM iceberg.logging.logs;

SELECT * FROM iceberg.logging.&quot;logs$snapshots&quot;;

SELECT * FROM iceberg.logging.&quot;logs@&amp;lt;insert-earlier-snapshot&amp;gt;&quot;;

CALL iceberg.system.rollback_to_snapshot(&apos;logging&apos;, &apos;logs&apos;, &amp;lt;insert-earlier-snapshot&amp;gt;);

/**
 * Back to the future snapshot
 */

SELECT * FROM iceberg.logging.&quot;logs$snapshots&quot;;

SELECT * FROM iceberg.logging.&quot;logs@&amp;lt;insert-latest-snapshot&amp;gt;&quot;;

CALL iceberg.system.rollback_to_snapshot(&apos;logging&apos;, &apos;logs&apos;, &amp;lt;insert-latest-snapshot&amp;gt;);

SELECT * FROM iceberg.logging.logs;

SELECT * FROM iceberg.logging.&quot;logs$partitions&quot;;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;question-of-the-week-what-do-i-do-to-restart-the-test-pipeline-if-it-fails-on-me&quot;&gt;Question of the week: What do I do to restart the test pipeline if it fails on me?&lt;/h2&gt;

&lt;p&gt;When developing with Trino, there is an automated build that acts as 
verification of any PR. It is powered by a GitHub Actions definition and runs 
all the tests in Trino when developers add new code. Sometimes tests unrelated
to the changes in your PR fail, which makes the PR appear unmergeable even
though the failure has nothing to do with your changes.&lt;/p&gt;

&lt;p&gt;Developers are aware of these flaky tests, and need a mechanism to resubmit 
their PR and rerun the tests. There is unfortunately no way to enable users to 
rerun tests through GitHub without write permissions to the Trino repository, so
you have to do a dummy commit.&lt;/p&gt;

&lt;p&gt;This can easily be done with this one-line hack: 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git commit --amend --no-edit &amp;amp;&amp;amp; git push -f&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The good news is, we have gone to great lengths to identify flaky
tests in the last year. These test failures are much rarer now, and we are 
constantly improving the build stability as an ongoing effort.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;h3 id=&quot;wtd-portland&quot;&gt;WTD Portland&lt;/h3&gt;

&lt;p&gt;Interested in supporting the Trino project, but don’t know where to start? A 
good place to start, with a lower barrier to entry, is contributing to the 
documentation. We will be supporting the 
&lt;a href=&quot;https://trino.io/blog/2021/04/14/wtd-writing-day.html&quot;&gt;writing day&lt;/a&gt; at the
Write the Docs (WTD) Portland conference this April! Join us to learn how to get
involved!&lt;/p&gt;

&lt;h3 id=&quot;virtual-trino-meetups&quot;&gt;Virtual Trino meetups&lt;/h3&gt;

&lt;p&gt;Come join us for the inaugural Virtual Trino meetup on April 21st in the virtual
meetup group in your region! See &lt;a href=&quot;./community.html&quot;&gt;the community page&lt;/a&gt; for more
details.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/events/277246268/&quot;&gt;Trino Americas meetup&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/events/277246173/&quot;&gt;Trino EMEA meetup&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/events/277246078/&quot;&gt;Trino APAC meetup&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At these meetups, the four Trino/Presto founders will be updating everyone on 
the state of Trino. We’ll discuss the rebrand, talk about the recent features, 
and discuss the trajectory of the project. Then we will host a hangout and an
ask me anything (AMA) session. Hope to see you all there!&lt;/p&gt;

&lt;p&gt;Blogs&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/05/03/a-gentle-introduction-to-iceberg.html&quot;&gt;Trino on ice I: A gentle introduction to Iceberg&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/07/12/in-place-table-evolution-and-cloud-compatibility-with-iceberg.html&quot;&gt;Trino on ice II: In-place table evolution and cloud compatibility with Iceberg&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/07/30/iceberg-concurrency-snapshots-spec.html&quot;&gt;Trino on ice III: Iceberg concurrency model, snapshots, and the Iceberg spec&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/08/12/deep-dive-into-iceberg-internals.html&quot;&gt;Trino on ice IV: Deep dive into Iceberg internals&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/expedia-group-tech/a-short-introduction-to-apache-iceberg-d34f628b6799&quot;&gt;https://medium.com/expedia-group-tech/a-short-introduction-to-apache-iceberg-d34f628b6799&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://engineering.linkedin.com/blog/2021/fastingest-low-latency-gobblin&quot;&gt;https://engineering.linkedin.com/blog/2021/fastingest-low-latency-gobblin&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/adobetech/iceberg-at-adobe-88cf1950e866&quot;&gt;https://medium.com/adobetech/iceberg-at-adobe-88cf1950e866&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/adobetech/high-throughput-ingestion-with-iceberg-ccf7877a413f&quot;&gt;https://medium.com/adobetech/high-throughput-ingestion-with-iceberg-ccf7877a413f&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/adobetech/taking-query-optimizations-to-the-next-level-with-iceberg-6c968b83cd6f&quot;&gt;https://medium.com/adobetech/taking-query-optimizations-to-the-next-level-with-iceberg-6c968b83cd6f&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://thenewstack.io/apache-iceberg-a-different-table-design-for-big-data/&quot;&gt;https://thenewstack.io/apache-iceberg-a-different-table-design-for-big-data/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Videos&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=mf8Hb0coI6o&quot;&gt;https://www.youtube.com/watch?v=mf8Hb0coI6o&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=Kxbzr7UP1dI&amp;amp;t=1274s&quot;&gt;https://www.youtube.com/watch?v=Kxbzr7UP1dI&amp;amp;t=1274s&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=QNmSXMQ-gY4&quot;&gt;https://www.youtube.com/watch?v=QNmSXMQ-gY4&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;Virtual Trino Americas&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;Virtual Trino EMEA&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;Virtual Trino APAC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;Trino Boston&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;Trino NYC&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;Trino San Francisco&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;Trino Los Angeles&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;Trino Chicago&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/15/training-advanced-sql.html&quot;&gt;Advanced SQL Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/07/30/training-query-tuning.html&quot;&gt;Query Tuning Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/13/training-security.html&quot;&gt;Security Training&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2020/08/27/training-performance.html&quot;&gt;Performance and Tuning Training&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Looks like Commander Bun Bun is safe on this Iceberg https://joshdata.me/iceberger.html</summary>

      
      
    </entry>
  
    <entry>
      <title>14: Iceberg: March of the Trinos</title>
      <link href="https://trino.io/episodes/14.html" rel="alternate" type="text/html" title="14: Iceberg: March of the Trinos" />
      <published>2021-04-01T00:00:00+00:00</published>
      <updated>2021-04-01T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/14</id>
      <content type="html" xml:base="https://trino.io/episodes/14.html">&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/14/trino-penguin.png&quot; /&gt;&lt;br /&gt;
March of the Trinos! Be careful Commander Bun Bun! That Iceberg doesn&apos;t look stable!&lt;br /&gt;
&lt;a href=&quot;https://joshdata.me/iceberger.html&quot;&gt;https://joshdata.me/iceberger.html&lt;/a&gt;
&lt;/p&gt;

&lt;h2 id=&quot;iceberg-links&quot;&gt;Iceberg links&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://iceberg.apache.org/&quot;&gt;Apache Iceberg&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://iceberg.apache.org/community/&quot;&gt;Apache Iceberg Community&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;David Phillips, creator of Trino/Presto, and CTO at 
 &lt;a href=&quot;https://starburst.io&quot;&gt;Starburst&lt;/a&gt; (&lt;a href=&quot;https://twitter.com/electrum32&quot;&gt;@electrum32&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;release-354&quot;&gt;Release 354&lt;/h2&gt;

&lt;p&gt;Release notes discussed: &lt;a href=&quot;https://trino.io/docs/current/release/release-354.html&quot;&gt;https://trino.io/docs/current/release/release-354.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Martin’s list:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Support for OAuth 2.0 in CLI&lt;/li&gt;
  &lt;li&gt;Support for MemSQL 3.2&lt;/li&gt;
  &lt;li&gt;Pushdown of ORDER BY … LIMIT for MemSQL, MySQL and SQL Server connectors&lt;/li&gt;
  &lt;li&gt;Support for time(p) in SQL Server&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred’s notes:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;LEFT, RIGHT and FULL JOIN&lt;/li&gt;
  &lt;li&gt;Preferred write partitioning on by default (needs statistics)&lt;/li&gt;
  &lt;li&gt;Small but useful fix on Elasticsearch (single value array)&lt;/li&gt;
  &lt;li&gt;Hive connector&lt;/li&gt;
  &lt;li&gt;Fix ACID table DELETE and UPDATE - critical fix is in! Boom!&lt;/li&gt;
  &lt;li&gt;Avro format improvement&lt;/li&gt;
  &lt;li&gt;CSV and Glue metadata improvement&lt;/li&gt;
  &lt;li&gt;Iceberg - date and timestamp improvement&lt;/li&gt;
  &lt;li&gt;CREATE SCHEMA fixes  in MySQL, PostgreSQL, Redshift and SQL Server&lt;/li&gt;
  &lt;li&gt;Bunch of other fixes in those connectors&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-week-apache-iceberg-and-the-table-format&quot;&gt;Concept of the week: Apache Iceberg and the table format&lt;/h2&gt;

&lt;h3 id=&quot;the-hive-table-format&quot;&gt;The Hive table format&lt;/h3&gt;

&lt;p&gt;For the last decade or so, big data professionals’ only option to query their 
data was to use, in some shape or form, the Hive model. The Hive model is
very simple, but it enabled running queries over files in a distributed file
system.&lt;/p&gt;

&lt;p&gt;To accomplish this, Hive uses a metastore service which stores and manages
metadata. For Hive and Trino, this metadata acts as a pointer to the files
containing the data, contains the file format, and has the column structure and
types. This enabled Hive to query the correct files and data within those files
for a SQL query. For more information on Hive’s architecture, read the
&lt;a href=&quot;/blog/2020/10/20/intro-to-hive-connector.html&quot;&gt;Gentle Introduction to Hive&lt;/a&gt;
blog. After the initial model gained adoption, Hive added other features such as
partitioning, which uses the directory structure of the filesystem to split the 
data files into different directories based on a special partition column. We 
talk about this in more depth &lt;a href=&quot;/episodes/5.html&quot;&gt;a few episodes back&lt;/a&gt;.&lt;/p&gt;
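&lt;p&gt;As an illustration, a Hive table partitioned on a date column might be laid 
out on the filesystem as follows (the paths here are hypothetical):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;/warehouse/logs/ds=2021-04-01/part-00000.orc
/warehouse/logs/ds=2021-04-01/part-00001.orc
/warehouse/logs/ds=2021-04-02/part-00000.orc
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;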

&lt;p&gt;The Hive model solved some initial issues facing engineers in big data, but 
it brought quite a few issues of its own. It is very rigid and unable to adapt
as your requirements change. For example, if you started partitioning your data
by date, segmented by month, that table is stuck with that partitioning forever.
The only way to change it is to create a new table with the new partition
values, and migrate all of your data from the old table to the new table. With
common data sizes such a migration is often a long process, sometimes even
impossible. Another issue stems from the separation of metadata stored in the 
metastore and data stored in the file system: many problems in Hive are caused 
by the metastore getting out of sync with the files. A third, but not final, 
issue is that operations like listing files are time-consuming on more modern 
object storage.&lt;/p&gt;

&lt;p&gt;As all these problems amassed over the years, clearly something needed to be 
done. In the last few years, a few candidate table formats have come to the 
forefront of data engineering trends: Apache Iceberg, Apache Hudi, and the 
proprietary Databricks Delta Lake. The goal of these systems is to modernize 
the old Hive data structure. To Trino, Iceberg is particularly promising due to 
features like schema versioning support and hidden partitioning. Let’s talk 
about some of these features in detail.&lt;/p&gt;

&lt;h3 id=&quot;the-iceberg-table-format&quot;&gt;The Iceberg table format&lt;/h3&gt;

&lt;p&gt;Iceberg is a new table format developed at Netflix that aims to replace older 
table formats like Hive with better flexibility as the schema evolves, atomic
operations, speed, and dependability. To be clear, it’s not a new file
format, as it still uses ORC, Parquet, and Avro, but a table format. Netflix 
donated Iceberg to the Apache Software Foundation and it is now a top-level
project!&lt;/p&gt;

&lt;p&gt;Iceberg stores the data on disk just like Hive, but it also stores the table
metadata in manifest files on disk along with the data itself. These &lt;em&gt;manifest 
files&lt;/em&gt; are Avro files that contain table metadata and list a subset of data 
files. &lt;em&gt;Manifest lists&lt;/em&gt; are a special type of manifest file that point to other 
manifest files. &lt;em&gt;Snapshots&lt;/em&gt; contain a manifest list that points to all the 
manifest files that belong to the snapshot. Another huge difference from Hive is
that the manifest files track table data at the file level, as opposed to the
directory level that Hive uses. By doing so, Iceberg avoids having to list all 
files in a directory, which is a very common and expensive operation.&lt;/p&gt;
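&lt;p&gt;As a rough text sketch of this hierarchy (the file names here are made up):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;snapshot
  manifest list: snap-1.avro
    manifest file: m0.avro  -&amp;gt; data files: *.orc
    manifest file: m1.avro  -&amp;gt; data files: *.orc
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;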

&lt;p align=&quot;center&quot;&gt;
  &lt;img align=&quot;center&quot; width=&quot;60%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/14/iceberg-metadata.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;By tracking files this way, we not only get better performance from object
storage, it also enables serializable isolation. This addresses the lack of
consistency between the metadata and file state experienced in Hive.&lt;/p&gt;

&lt;p&gt;One of the greater advantages of Iceberg over Hive is in-place table
evolution. This means that you can add, drop, rename, reorder, or update a 
column without any expensive refactoring of tables or moving data around, and 
with no adverse effects on your data or metadata.&lt;/p&gt;

&lt;p&gt;Partition evolution and hidden partitions are particularly invaluable. In 
Iceberg, the &lt;em&gt;partition spec&lt;/em&gt; is a description of how to partition data in a 
table, consisting of a list of source columns and transforms. Once the spec is 
created, it generates a partition tuple that is applied uniformly to the files 
created with that spec. Unlike Hive, which requires you to compute and write a 
special column that acts as the partition value, Iceberg stores partition values
unmodified. Here’s an example partition spec generated in the Java API.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;PartitionSpec spec = PartitionSpec.builderFor(schema)
        .hour(&quot;event_time&quot;)
        .identity(&quot;level&quot;)
        .build();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This example creates a separate hourly partition on the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;event_time&lt;/code&gt; field and 
uses the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;identity()&lt;/code&gt; function to generate another level of partitioning
on the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;level&lt;/code&gt; field. If at a later time you decide you are getting too many 
small files because your partitions are too fine-grained, you can update the 
partition spec, and Iceberg starts writing new files according to the updated 
spec. Again, this happens without creating a new table or moving data around, 
and all queries continue to return correct results. This kind of evolution is a 
problem with Hive.&lt;/p&gt;
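&lt;p&gt;As a sketch of what such an update can look like with the Iceberg Java API 
(this assumes a recent Iceberg version with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UpdatePartitionSpec&lt;/code&gt; API; the 
exact calls may differ):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// switch from hourly to coarser daily partitioning on event_time
table.updateSpec()
        .removeField(Expressions.hour(&quot;event_time&quot;))
        .addField(Expressions.day(&quot;event_time&quot;))
        .commit();
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;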

&lt;p align=&quot;center&quot;&gt;
  &lt;img align=&quot;center&quot; width=&quot;60%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/14/partition-spec-evolution.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;If all that isn’t enough, you can also do time travel and version rollback with
Iceberg. As we mentioned above, Iceberg keeps track of various snapshots of 
your data in time through manifest files. As long as you keep those older 
snapshots around, the files associated with those snapshots stick around as 
well. This allows you to move around to previous views of the data. This is 
useful for testing, recovery, and many other purposes. Just as
you can time travel, you can make the time travel permanent by rolling back
any unintended changes and deleting the undesired snapshot.&lt;/p&gt;
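&lt;p&gt;For example, in Trino you can list the snapshots of a table and then query an 
older one directly (replace the placeholder with a real snapshot ID from the 
first query):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT snapshot_id, committed_at FROM iceberg.logging.&quot;logs$snapshots&quot;;

SELECT * FROM iceberg.logging.&quot;logs@&amp;lt;snapshot-id&amp;gt;&quot;;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;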

&lt;p&gt;Iceberg is also able to offer fast scan planning by filtering out the metadata
files that are irrelevant to the scan, and using the partition spec to find only
the files that can contain data matching the query. Iceberg filters the metadata
by checking whether the partition value ranges recorded in the metadata overlap
with the query. Then, while processing the list of manifest files, Iceberg
filters files by the query predicates on the partition columns, and applies
column stats to help prune out files that don’t match. Iceberg also processes
manifests with multiple concurrent readers to speed things up as a final
measure.&lt;/p&gt;

&lt;p&gt;Saving the best for last: Iceberg is a community standard and has 
&lt;a href=&quot;https://iceberg.apache.org/spec/&quot;&gt;a full written specification&lt;/a&gt;, which is a nice
change from Hive, which only has an ad-hoc specification that people adhere to 
in some ways. There have been many issues over the years due to variance in how 
the unwritten specification gets interpreted. A written spec not only enables 
people to understand how to use Iceberg, but documents how others can implement 
the same features in entirely different systems. Let’s wait to do a deep dive on 
the spec for the next episode when we bring on Ryan Blue, creator of Iceberg, to 
dig into these details.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-1067-add-iceberg-connector&quot;&gt;PR of the week: PR 1067 Add Iceberg connector&lt;/h2&gt;

&lt;p&gt;A huge shoutout goes to &lt;a href=&quot;https://github.com/Parth-Brahmbhatt&quot;&gt;Parth Brahmbhatt&lt;/a&gt;,
a Senior Software Engineer at Netflix who created this week’s 
&lt;a href=&quot;https://github.com/trinodb/trino/pull/1067&quot;&gt;PR of the week&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/docs/current/release/release-318.html&quot;&gt;Release 318&lt;/a&gt; 
introduced this code that supported querying tables from Apache Iceberg in
Trino. While the code existed, the Iceberg connector wasn’t officially
released or documented until a little over a year later in 
&lt;a href=&quot;/docs/current/release/release-341.html&quot;&gt;release 341&lt;/a&gt;, once the connector reached
maturity.&lt;/p&gt;

&lt;h3 id=&quot;future-development-for-the-trino-iceberg-connector&quot;&gt;Future development for the Trino Iceberg connector&lt;/h3&gt;

&lt;p&gt;Still, there are some strange artifacts that we’re facing today in the connector.
For example, if you create a table with the Iceberg Java API, &lt;a href=&quot;https://github.com/apache/iceberg/blob/996ed979f396f2c7cc12ca824a3fe758f2c486ce/hive/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java#L222&quot;&gt;it creates
Iceberg tables with &amp;lt;table_type, ICEBERG&amp;gt;&lt;/a&gt;
but Trino &lt;a href=&quot;https://github.com/prestosql/presto/blob/master/presto-iceberg/src/main/java/io/prestosql/plugin/iceberg/HiveTableOperations.java#L190&quot;&gt;creates and reads with &amp;lt;table_type, iceberg&amp;gt;&lt;/a&gt;.
See &lt;a href=&quot;https://github.com/trinodb/trino/issues/1592&quot;&gt;Issue 1592&lt;/a&gt; for status and 
details. In general, we can track some of the broader changes that are being 
made to &lt;a href=&quot;https://github.com/trinodb/trino/issues/1324&quot;&gt;the Iceberg connector here&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;demo-creating-tables-with-iceberg-and-reading-the-data-in-trino&quot;&gt;Demo: Creating tables with Iceberg and reading the data in Trino&lt;/h2&gt;

&lt;p&gt;For this week’s demo, I wanted to play around with the Iceberg Java API directly.
You also have the option to use Trino, Spark, or other engines to ingest and
query the data, but I wanted to use the vanilla Iceberg APIs to experience them
firsthand and hopefully solidify my learning of Iceberg concepts in the process.
Make sure you follow the instructions in the repository if you don’t have Docker
or Java installed.&lt;/p&gt;

&lt;p&gt;Let’s start up a local Trino coordinator and Hive metastore. Clone this 
repository and navigate to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;iceberg/trino-iceberg-minio&lt;/code&gt; directory. Then
start up the containers using Docker Compose.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git@github.com:bitsondatadev/trino-getting-started.git

cd iceberg/trino-iceberg-minio

docker-compose up -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In your favorite IDE, open the files under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;iceberg/iceberg-java&lt;/code&gt; into your
project and run the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IcebergMain&lt;/code&gt; class.&lt;/p&gt;

&lt;p&gt;This class creates a logging table if it doesn’t exist. Once you run this code,
you can check that the table exists in the metastore under
TABLE_PARAMS. But if you run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SHOW TABLES IN iceberg.logging;&lt;/code&gt; you’ll notice that
the table doesn’t show up due to &lt;a href=&quot;https://github.com/trinodb/trino/issues/1592&quot;&gt;the issue we discussed above&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let’s update the TABLE_PARAMS entry in the metastore db and then query the table
again.&lt;/p&gt;
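&lt;p&gt;A sketch of that update, assuming the standard Hive metastore schema (adjust 
for your own metastore database):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-- flip the table_type value so Trino recognizes the table
UPDATE TABLE_PARAMS
SET PARAM_VALUE = &apos;iceberg&apos;
WHERE PARAM_KEY = &apos;table_type&apos; AND PARAM_VALUE = &apos;ICEBERG&apos;;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;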

&lt;h2 id=&quot;question-of-the-week-why-does-trino-still-depend-on-the-hive-metastore-if-metadata-for-iceberg-saves-to-the-filesystem&quot;&gt;Question of the week: Why does Trino still depend on the Hive metastore if metadata for Iceberg saves to the filesystem?&lt;/h2&gt;

&lt;p&gt;We kept the metastore because many existing tests for the Hive connector are
built around it, and we want to give the Iceberg connector ample time to
mature before we migrate entirely away from the metastore. We also wanted the 
metastore to be the initial method of use for Iceberg, as most developers 
would initially be migrating from an existing Hive catalog, and we wanted this 
transition to use existing, tested components.&lt;/p&gt;

&lt;p&gt;Currently, the metastore isn’t used the same way as in Hive. Trino stores a
top-level directory that points to the metadata manifest file location and other
statistics around the table in the TABLE_PARAMS table of the metastore. There
is a &lt;a href=&quot;https://github.com/trinodb/trino/pull/6977&quot;&gt;pull request created by Jack Ye&lt;/a&gt;
to migrate away from the requirement to use the Hive metastore when using 
Iceberg with Trino.&lt;/p&gt;

&lt;h2 id=&quot;tip-of-the-iceberg&quot;&gt;Tip of the Iceberg&lt;/h2&gt;

&lt;p&gt;One last bit of fun with Iceberg. Let’s do a little experiment called “Will 
the iceberg tip?”:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Go to &lt;a href=&quot;https://iceberg.apache.org/&quot;&gt;https://iceberg.apache.org/&lt;/a&gt; and take a look at the logo.&lt;/li&gt;
  &lt;li&gt;Now go to &lt;a href=&quot;https://joshdata.me/iceberger.html&quot;&gt;https://joshdata.me/iceberger.html&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Draw the Apache Iceberg logo and see what happens.&lt;/li&gt;
  &lt;li&gt;Now draw the iceberg in the image above that Commander Bun Bun is on.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When drawing the iceberg like the image with Commander Bun Bun, the iceberg tips
over. Careful Commander Bun Bun! It looks like the Apache logo wins! Shout out 
to &lt;a href=&quot;https://twitter.com/JoshData&quot;&gt;Joshua Tauberer&lt;/a&gt; for the web page. Shout out 
to &lt;a href=&quot;https://twitter.com/GlacialMeg&quot;&gt;Megan Thompson-Munson&lt;/a&gt; for the tweet that 
started the page. Shout out to 
&lt;a href=&quot;https://www.linkedin.com/in/bartonwright/&quot;&gt;Barton Wright&lt;/a&gt; from Manfred’s team 
of writers for being the geek to find this. Shout out to 
&lt;a href=&quot;https://twitter.com/aliLoney&quot;&gt;Ali&lt;/a&gt; for being a good sport and setting Commander 
Bun Bun on the iceberg.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Come join us for the inaugural Virtual Trino meetup on April 21st in the virtual
meetup group in your region!&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/events/277246268/&quot;&gt;Americas meetup&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/events/277246173/&quot;&gt;EMEA meetup&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/events/277246078/&quot;&gt;APAC meetup&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At these meetups, the four Trino/Presto founders will be updating everyone on the
state of Trino. We’ll discuss the rebrand, talk about the recent features, and 
discuss the trajectory of the project. Then we will host a hangout and AMA. Hope
to see you all there!&lt;/p&gt;

&lt;p&gt;Blogs&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/05/03/a-gentle-introduction-to-iceberg.html&quot;&gt;Trino on ice I: A gentle introduction to Iceberg&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/07/12/in-place-table-evolution-and-cloud-compatibility-with-iceberg.html&quot;&gt;Trino on ice II: In-place table evolution and cloud compatibility with Iceberg&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/07/30/iceberg-concurrency-snapshots-spec.html&quot;&gt;Trino on ice III: Iceberg concurrency model, snapshots, and the Iceberg spec&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/blog/2021/08/12/deep-dive-into-iceberg-internals.html&quot;&gt;Trino on ice IV: Deep dive into Iceberg internals&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/expedia-group-tech/a-short-introduction-to-apache-iceberg-d34f628b6799&quot;&gt;https://medium.com/expedia-group-tech/a-short-introduction-to-apache-iceberg-d34f628b6799&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://engineering.linkedin.com/blog/2021/fastingest-low-latency-gobblin&quot;&gt;https://engineering.linkedin.com/blog/2021/fastingest-low-latency-gobblin&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/adobetech/iceberg-at-adobe-88cf1950e866&quot;&gt;https://medium.com/adobetech/iceberg-at-adobe-88cf1950e866&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/adobetech/high-throughput-ingestion-with-iceberg-ccf7877a413f&quot;&gt;https://medium.com/adobetech/high-throughput-ingestion-with-iceberg-ccf7877a413f&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/adobetech/taking-query-optimizations-to-the-next-level-with-iceberg-6c968b83cd6f&quot;&gt;https://medium.com/adobetech/taking-query-optimizations-to-the-next-level-with-iceberg-6c968b83cd6f&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://thenewstack.io/apache-iceberg-a-different-table-design-for-big-data/&quot;&gt;https://thenewstack.io/apache-iceberg-a-different-table-design-for-big-data/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Videos&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=mf8Hb0coI6o&quot;&gt;https://www.youtube.com/watch?v=mf8Hb0coI6o&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=Kxbzr7UP1dI&amp;amp;t=1274s&quot;&gt;https://www.youtube.com/watch?v=Kxbzr7UP1dI&amp;amp;t=1274s&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=QNmSXMQ-gY4&quot;&gt;https://www.youtube.com/watch?v=QNmSXMQ-gY4&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup Groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;https://www.meetup.com/trino-americas/&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;https://www.meetup.com/trino-emea/&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-apac/&quot;&gt;https://www.meetup.com/trino-apac/&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;https://www.meetup.com/trino-boston/&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;https://www.meetup.com/trino-nyc/&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;https://www.meetup.com/trino-san-francisco/&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;https://www.meetup.com/trino-los-angeles/&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Mid West (US)
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;https://www.meetup.com/trino-chicago/&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from 
O’Reilly. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>March of the Trinos! Be careful Commander Bun Bun! That Iceberg doesn&apos;t look stable! https://joshdata.me/iceberger.html</summary>

      
      
    </entry>
  
    <entry>
      <title>13: Trino takes a sip of Pinot!</title>
      <link href="https://trino.io/episodes/13.html" rel="alternate" type="text/html" title="13: Trino takes a sip of Pinot!" />
      <published>2021-03-18T00:00:00+00:00</published>
      <updated>2021-03-18T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/13</id>
      <content type="html" xml:base="https://trino.io/episodes/13.html">&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/13/trinot.png&quot; /&gt;&lt;br /&gt;
Commander Bun Bun loves sippin&apos; on Pinot after a hard day of data exploration!
&lt;/p&gt;

&lt;h2 id=&quot;pinot-links&quot;&gt;Pinot links&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://communityinviter.com/apps/apache-pinot/apache-pinot&quot;&gt;Apache Pinot Slack&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/apache-pinot/events/275991991/&quot;&gt;Pinot Meetup&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Xiang Fu, project management committee (PMC) chair and committer at &lt;a href=&quot;https://pinot.apache.org/&quot;&gt;Apache Pinot&lt;/a&gt;
  and co-founder of stealth mode startup (&lt;a href=&quot;https://twitter.com/xiangfu0&quot;&gt;@xiangfu0&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Elon Azoulay, software engineer at stealth mode startup (&lt;a href=&quot;https://twitter.com/ElonAzoulay&quot;&gt;@ElonAzoulay&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;release-353&quot;&gt;Release 353&lt;/h2&gt;

&lt;p&gt;Release notes discussed: &lt;a href=&quot;https://trino.io/docs/current/release/release-353.html&quot;&gt;https://trino.io/docs/current/release/release-353.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Martin’s list:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;New ClickHouse connector&lt;/li&gt;
  &lt;li&gt;Support for correlated subqueries involving UNNEST&lt;/li&gt;
  &lt;li&gt;CREATE/DROP TABLE in BigQuery connector&lt;/li&gt;
  &lt;li&gt;Reading and writing column stats in Glue Metastore&lt;/li&gt;
  &lt;li&gt;Support for Apache Phoenix 5.1&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred’s notes:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;New geometry functions&lt;/li&gt;
  &lt;li&gt;A whole bunch of correctness and performance improvements&lt;/li&gt;
  &lt;li&gt;Env var (and hence secrets) support for RPM-based installs&lt;/li&gt;
  &lt;li&gt;Hive - performance improvements for bucketed table inserts&lt;/li&gt;
  &lt;li&gt;Kafka - schema registry improvements&lt;/li&gt;
  &lt;li&gt;Experimental join pushdown in a bunch of JDBC connectors&lt;/li&gt;
  &lt;li&gt;Also a bunch of fixes on JDBC connectors&lt;/li&gt;
  &lt;li&gt;Quite a list of changes on the SPI - ensure to check if you have a plugin&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-week-data-cubes-and-apache-pinot&quot;&gt;Concept of the week: Data cubes and Apache Pinot&lt;/h2&gt;

&lt;p&gt;Before diving into Pinot, I think it’s worthwhile to discuss some theoretical
background to motivate some of the use cases Pinot solves for. We cover the 
concept of data cubes and how they are used in traditional data warehousing to 
speed up queries and minimize unnecessary work on your OLAP system.&lt;/p&gt;

&lt;h3 id=&quot;data-cubes-and-molap-multi-dimensional-online-analytics-processing&quot;&gt;Data cubes and MOLAP (Multi-dimensional online analytics processing)&lt;/h3&gt;

&lt;p&gt;In data analytics, there are many access patterns that tend to repeat themselves
over and over again. It is very common to need to split and merge data based on 
date and time values. Or perhaps you ask a lot of questions about a 
specific customer, or even a specific product. Answering these questions 
typically involves aggregations like sums, averages, and counts. 
Wouldn’t it make sense to cache some of these intermediary results?&lt;/p&gt;

&lt;p&gt;A common way to visualize the columns that are commonly bucketed to some value
or range of values is to show them as a cube that is sliced up into smaller
cells along each dimension. This visualization derives from the traditional form 
of OLAP, multi-dimensional OLAP (MOLAP).&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/13/data_cube.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;This cube represents a caching of data aggregations that are grouped by commonly
used dimensions. For example, the displayed cube would be the pre-aggregation of
the following query:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT part, store, customer, COUNT(*)
FROM cube_table
GROUP BY part, store, customer
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If we want to get the data for a particular customer, we can take a “slice” of
that cube by specifying a particular customer. The following query returns the
green square above from our cube.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT part, store, COUNT(*)
FROM cube_table
WHERE customer = &apos;Bob&apos;
GROUP BY part, store
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now what if we want to flatten one of the dimensions? While this can be managed
with a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GROUP BY&lt;/code&gt; as before, depending on the system this may ignore any cached
data and scan over all the rows. For this, SQL reserved a special set of
keywords around cubes. We won’t dive into that in depth now, but for our current
goal of flattening a dimension, we can use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ROLLUP&lt;/code&gt;. Using the keyword &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ROLLUP&lt;/code&gt;
indicates to the underlying system that you intend to aggregate over the 
pre-materialized data rather than scan over all rows to compute again. This
gives you the total count of parts per store using the counts of the data cube.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT part, store, COUNT(*)
FROM cube_table
GROUP BY ROLLUP (part, store)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, although we used simple counts, you can precompute a lot of other aggregate
data like sums, minimums, maximums, percentiles, and so on. These can serve
frequently issued queries without requiring a new computation every time. That
is the goal of MOLAP and data cubes.&lt;/p&gt;
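
&lt;p&gt;As a sketch of this idea, a single pre-aggregation can carry several measures
at once. SQL exposes this through the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CUBE&lt;/code&gt; keyword, which materializes every
combination of the listed dimensions. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;quantity&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;price&lt;/code&gt; columns here
are hypothetical additions to our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cube_table&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT part, store, customer, COUNT(*), SUM(quantity), AVG(price), MAX(price)
FROM cube_table
GROUP BY CUBE (part, store, customer)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;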

&lt;h3 id=&quot;apache-pinot&quot;&gt;Apache Pinot&lt;/h3&gt;

&lt;p&gt;Now let’s move on to Apache Pinot. It is a realtime distributed OLAP datastore, 
designed to answer OLAP queries with low latency. Although there may be a lot of
words there that overlap with the Trino description, the key differentiators are
realtime and low latency. Trino performs batch processing and is not a realtime
system, whereas Pinot is great for ingesting data in batch or as a stream. The
other key phrase, low latency, could technically apply to both Pinot and Trino,
but in the context of realtime subsecond latency, Trino is slow compared to
Pinot. This is due to the specialized indexes that Pinot uses to store the data,
which we cover shortly. Importantly, another big distinction is that Trino does
not store any data itself; it is purely a query engine. Xiang has a really great
summary slide that easily shows the strengths of each system and why they work
so well together.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/13/latency_flexibility.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;p&gt;While Trino is not as fast as Pinot, it is able to handle a broader set of
use cases like performing broad joins over open data formats in data lakes. 
This is what motivated work on the Trino Pinot connector. You can have the speed
of Pinot, while having the flexibility of Trino.&lt;/p&gt;
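
&lt;p&gt;For example, once the connector is configured, a single Trino query can join
realtime Pinot data with a table in your data lake. The catalog, schema, and
table names below are made up for illustration:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT e.customer_id, c.region, COUNT(*) AS events
FROM pinot.default.click_events e
JOIN hive.crm.customers c ON e.customer_id = c.customer_id
GROUP BY e.customer_id, c.region
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;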

&lt;p&gt;Now that you understand the common use case for Pinot, it’s important to know 
the main goals of Pinot.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;One primary goal is to keep response times of aggregation queries
  predictable, regardless of how many requests Pinot handles. As it scales
  you won’t see a degradation of performance. This is achieved by Pinot’s
  custom indices and storage formats.
    &lt;p align=&quot;center&quot;&gt;
    &lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/13/data_value.jpeg&quot; /&gt;&lt;br /&gt;
 &lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Another goal of Pinot is to revive the value of historical data. Data
  reaches a particular point in its lifecycle where it becomes less valuable as
  it ages. While all data can add some value no matter its age, there is a
  tradeoff in scanning many rows to glean information from antiquated data.
  Pinot aims to remove this tradeoff: most questions around historical data are
  asked in aggregate, and those aggregates can be summarized and queried at a
  low cost.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;The final goal is to manage dimension explosion. One of the difficulties
  with managing a system that caches all this historic data is handling
  dimension explosion that occurs when you cache every possible combination of
  data. Above we showed a three-dimensional cube, but Pinot can handle a much
  larger number of dimensions. However, just because you can, doesn’t mean you
  should. Pinot has a lot of smarts around using the data, and some good
  defaults to determine the maximum number of buckets per dimension. This helps
  contain an exploding cache while maintaining fast results.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;pinot-architecture&quot;&gt;Pinot architecture&lt;/h3&gt;

&lt;p&gt;Now that we have covered Pinot theory and goals, let’s take a quick look at the
architecture.&lt;/p&gt;

&lt;p&gt;A &lt;a href=&quot;https://docs.pinot.apache.org/basics/components/cluster&quot;&gt;Pinot cluster&lt;/a&gt; 
consists of a &lt;a href=&quot;https://docs.pinot.apache.org/basics/components/controller&quot;&gt;controller&lt;/a&gt;, 
&lt;a href=&quot;https://docs.pinot.apache.org/basics/components/broker&quot;&gt;broker&lt;/a&gt;, 
&lt;a href=&quot;https://docs.pinot.apache.org/basics/components/server&quot;&gt;server&lt;/a&gt;, and
optionally a &lt;a href=&quot;https://docs.pinot.apache.org/basics/components/minion&quot;&gt;minion&lt;/a&gt;
to purge data.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/13/pinot_architecture.svg&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-2028-add-pinot-connector&quot;&gt;PR of the week: PR 2028 Add Pinot connector&lt;/h2&gt;

&lt;p&gt;Our guest on the show today, Elon Azoulay, is the author of 
&lt;a href=&quot;https://github.com/trinodb/trino/pull/2028&quot;&gt;this PR&lt;/a&gt;, so we can ask him all
about it now.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/13/trino_pinot_connector.png&quot; /&gt;&lt;br /&gt;
&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/connector/pinot.html#configuration&quot;&gt;Basic configuration (Pinot controller url, Pinot segment limit)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Two ways to connect to Pinot - broker and server - and their tradeoffs 
  (e.g. the segment limit for the server)&lt;/li&gt;
  &lt;li&gt;Talk about broker passthrough queries, i.e. select * from “select … from
  pinot_table …”&lt;/li&gt;
  &lt;li&gt;Server segment limit, which we eventually want to eliminate, and broker query parsing
    &lt;ul&gt;
      &lt;li&gt;How to crash the Pinot server.&lt;/li&gt;
      &lt;li&gt;Streaming server alternative&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;
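
&lt;p&gt;A minimal catalog file for the connector looks roughly like the sketch below.
The host and port are placeholders matching the demo setup later in these notes;
see the configuration documentation linked above for the full list of
properties:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;connector.name=pinot
pinot.controller-urls=pinot-controller:9000
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;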

&lt;h3 id=&quot;future-pinot-features-in-trino&quot;&gt;Future Pinot features in Trino&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/pull/6069&quot;&gt;Aggregation pushdown (PR 6069)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p align=&quot;center&quot;&gt;
 &lt;img align=&quot;center&quot; width=&quot;60%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/13/aggregation_pushdown.png&quot; /&gt;&lt;br /&gt;
 &lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/pull/7162&quot;&gt;Pinot insert (PR 7162)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/pull/7164&quot;&gt;Pinot create table (PR 7164)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/pull/7160&quot;&gt;Pinot drop table (PR 7160)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino/pull/7163&quot;&gt;Pinot 6 (PR 7163)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Pinot filter clause parsing (see question of the week below)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;demo-pinot-batch-insertion-and-query-using-trino-pinot-connector&quot;&gt;Demo: Pinot batch insertion and query using Trino Pinot connector&lt;/h2&gt;

&lt;p&gt;To put this PR to the test, we set up a Pinot cluster using Docker Compose.&lt;/p&gt;

&lt;p&gt;To load the data, we’re going to use a simple batch import, but you can also 
&lt;a href=&quot;https://docs.pinot.apache.org/basics/data-import/upsert&quot;&gt;insert the data in a stream&lt;/a&gt;
using &lt;a href=&quot;https://kafka.apache.org/&quot;&gt;Kafka&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Let’s start up the Pinot cluster along with the required Zookeeper and Kafka
broker. Clone this repository and navigate to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pinot/trino-pinot&lt;/code&gt; directory.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git@github.com:bitsondatadev/trino-getting-started.git

cd community_tutorials/pinot/trino-pinot

docker-compose up -d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To do a batch insert, we stage a CSV file for Pinot to read the data from. 
Create a directory underneath a local temp folder and write the file there:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;mkdir -p /tmp/pinot-quick-start/rawdata

echo &quot;studentID,firstName,lastName,gender,subject,score,timestampInEpoch
200,Lucy,Smith,Female,Maths,3.8,1570863600000
200,Lucy,Smith,Female,English,3.5,1571036400000
201,Bob,King,Male,Maths,3.2,1571900400000
202,Nick,Young,Male,Physics,3.6,1572418800000&quot; &amp;gt; /tmp/pinot-quick-start/rawdata/transcript.csv
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In order for Pinot to understand the CSV data, we must provide it a 
&lt;a href=&quot;https://docs.pinot.apache.org/configuration-reference/schema&quot;&gt;schema&lt;/a&gt;.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;echo &quot;{
    \&quot;schemaName\&quot;: \&quot;transcript\&quot;,
    \&quot;dimensionFieldSpecs\&quot;: [
      {
        \&quot;name\&quot;: \&quot;studentID\&quot;,
        \&quot;dataType\&quot;: \&quot;INT\&quot;
      },
      {
        \&quot;name\&quot;: \&quot;firstName\&quot;,
        \&quot;dataType\&quot;: \&quot;STRING\&quot;
      },
      {
        \&quot;name\&quot;: \&quot;lastName\&quot;,
        \&quot;dataType\&quot;: \&quot;STRING\&quot;
      },
      {
        \&quot;name\&quot;: \&quot;gender\&quot;,
        \&quot;dataType\&quot;: \&quot;STRING\&quot;
      },
      {
        \&quot;name\&quot;: \&quot;subject\&quot;,
        \&quot;dataType\&quot;: \&quot;STRING\&quot;
      }
    ],
    \&quot;metricFieldSpecs\&quot;: [
      {
        \&quot;name\&quot;: \&quot;score\&quot;,
        \&quot;dataType\&quot;: \&quot;FLOAT\&quot;
      }
    ],
    \&quot;dateTimeFieldSpecs\&quot;: [{
      \&quot;name\&quot;: \&quot;timestampInEpoch\&quot;,
      \&quot;dataType\&quot;: \&quot;LONG\&quot;,
      \&quot;format\&quot; : \&quot;1:MILLISECONDS:EPOCH\&quot;,
      \&quot;granularity\&quot;: \&quot;1:MILLISECONDS\&quot;
    }]
}&quot; &amp;gt; /tmp/pinot-quick-start/transcript-schema.json
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now we are almost ready to create the &lt;a href=&quot;https://docs.pinot.apache.org/basics/components/table&quot;&gt;table&lt;/a&gt;. 
Instead of adding table configurations as part of a SQL command, Pinot enables
you to store table configurations as a file. This is a nice option that
decouples the configuration from the DDL, which makes for simpler scripting in
batch setups.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;echo &quot;{
    \&quot;tableName\&quot;: \&quot;transcript\&quot;,
    \&quot;segmentsConfig\&quot; : {
      \&quot;timeColumnName\&quot;: \&quot;timestampInEpoch\&quot;,
      \&quot;timeType\&quot;: \&quot;MILLISECONDS\&quot;,
      \&quot;replication\&quot; : \&quot;1\&quot;,
      \&quot;schemaName\&quot; : \&quot;transcript\&quot;
    },
    \&quot;tableIndexConfig\&quot; : {
      \&quot;invertedIndexColumns\&quot; : [],
      \&quot;loadMode\&quot;  : \&quot;MMAP\&quot;
    },
    \&quot;tenants\&quot; : {
      \&quot;broker\&quot;:\&quot;DefaultTenant\&quot;,
      \&quot;server\&quot;:\&quot;DefaultTenant\&quot;
    },
    \&quot;tableType\&quot;:\&quot;OFFLINE\&quot;,
    \&quot;metadata\&quot;: {}
}&quot; &amp;gt; /tmp/pinot-quick-start/transcript-table-offline.json
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Once you have created these three files and verified that the Docker containers
are running, run the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Add Table&lt;/code&gt; command:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker run --rm -ti \
    --network=trino-pinot_trino-network \
    -v /tmp/pinot-quick-start:/tmp/pinot-quick-start \
    --name pinot-batch-table-creation \
    apachepinot/pinot:latest AddTable \
    -schemaFile /tmp/pinot-quick-start/transcript-schema.json \
    -tableConfigFile /tmp/pinot-quick-start/transcript-table-offline.json \
    -controllerHost pinot-controller \
    -controllerPort 9000 -exec
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now that the table exists, we can see it in the 
&lt;a href=&quot;http://localhost:9000/#/tables&quot;&gt;Pinot web UI&lt;/a&gt;. Let’s insert some data using a 
batch job specification:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;echo &quot;executionFrameworkSpec:
  name: &apos;standalone&apos;
  segmentGenerationJobRunnerClassName: &apos;org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner&apos;
  segmentTarPushJobRunnerClassName: &apos;org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner&apos;
  segmentUriPushJobRunnerClassName: &apos;org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner&apos;
jobType: SegmentCreationAndTarPush
inputDirURI: &apos;/tmp/pinot-quick-start/rawdata/&apos;
includeFileNamePattern: &apos;glob:**/*.csv&apos;
outputDirURI: &apos;/tmp/pinot-quick-start/segments/&apos;
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
recordReaderSpec:
  dataFormat: &apos;csv&apos;
  className: &apos;org.apache.pinot.plugin.inputformat.csv.CSVRecordReader&apos;
  configClassName: &apos;org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig&apos;
tableSpec:
  tableName: &apos;transcript&apos;
  schemaURI: &apos;http://pinot-controller:9000/tables/transcript/schema&apos;
  tableConfigURI: &apos;http://pinot-controller:9000/tables/transcript&apos;
pinotClusterSpecs:
  - controllerURI: &apos;http://pinot-controller:9000&apos;&quot; &amp;gt; /tmp/pinot-quick-start/docker-job-spec.yml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now launch this batch job with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LaunchDataIngestionJob&lt;/code&gt; task.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker run --rm -ti \
    --network=trino-pinot_trino-network \
    -v /tmp/pinot-quick-start:/tmp/pinot-quick-start \
    --name pinot-data-ingestion-job \
    apachepinot/pinot:latest LaunchDataIngestionJob \
    -jobSpecFile /tmp/pinot-quick-start/docker-job-spec.yml
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
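
&lt;p&gt;With the segment pushed, you can query the ingested data through the Trino
Pinot connector. Assuming a catalog named &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pinot&lt;/code&gt; is configured against this
cluster, a query like the following aggregates the scores we just loaded:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT firstName, lastName, AVG(score) AS avg_score
FROM pinot.default.transcript
GROUP BY firstName, lastName
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;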

&lt;p&gt;We modified this demo from the tutorials available on the Pinot website:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.pinot.apache.org/basics/getting-started/pushing-your-data-to-pinot&quot;&gt;https://docs.pinot.apache.org/basics/getting-started/pushing-your-data-to-pinot&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.pinot.apache.org/basics/getting-started/running-pinot-in-docker&quot;&gt;https://docs.pinot.apache.org/basics/getting-started/running-pinot-in-docker&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;question-of-the-week-why-does-my-passthrough-query-not-work-in-the-pinot-connector&quot;&gt;Question of the week: Why does my passthrough query not work in the Pinot connector?&lt;/h2&gt;

&lt;p&gt;The passthrough queries may be failing due to uppercase constants that need to
be surrounded with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UPPER()&lt;/code&gt;. For example, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&apos;FOO&apos;&lt;/code&gt; in this query would be 
rendered as all lowercase once it is passed to Pinot:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT * 
FROM &quot;SELECT col1, col2, COUNT(*) FROM pinot_table WHERE col2 = &apos;FOO&apos; GROUP BY col1, col2&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The fix is to pass &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&apos;FOO&apos;&lt;/code&gt; to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UPPER()&lt;/code&gt; in the passthrough query.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT * 
FROM &quot;SELECT col1, col2, COUNT(*) FROM pinot_table WHERE col2 = UPPER(&apos;FOO&apos;) GROUP BY col1, col2&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It could also be due to parsing of functions in filters. A workaround is to put
the filter outside of the double quotes, which can work in some cases. For
example, column and table names can be mixed case, as the connector will auto
resolve them. Mixed-case constants, however, would not work with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;UPPER()&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT * 
FROM &quot;SELECT col1, col2, COUNT(*) FROM pinot_table WHERE col2 = &apos;Foo&apos; GROUP BY col1, col2&quot;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The filter can be hoisted into the outer query:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT * 
FROM &quot;SELECT col1, col2, COUNT(*) FROM pinot_table GROUP BY col1, col2&quot; WHERE col2 = &apos;Foo&apos;;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There is ongoing work to improve this parsing: 
&lt;a href=&quot;https://github.com/trinodb/trino/pull/7161&quot;&gt;Pinot filter clause parsing (PR 7161)&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/apache-pinot-developer-blog/real-time-analytics-with-presto-and-apache-pinot-part-i-cc672caea307&quot;&gt;https://medium.com/apache-pinot-developer-blog/real-time-analytics-with-presto-and-apache-pinot-part-i-cc672caea307&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/apache-pinot-developer-blog/real-time-analytics-with-presto-and-apache-pinot-part-ii-3d09ff937713&quot;&gt;https://medium.com/apache-pinot-developer-blog/real-time-analytics-with-presto-and-apache-pinot-part-ii-3d09ff937713&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/apache-pinot-developer-blog/exploring-olap-on-kubernetes-with-apache-pinot-32f12233dc0b&quot;&gt;https://medium.com/apache-pinot-developer-blog/exploring-olap-on-kubernetes-with-apache-pinot-32f12233dc0b&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/apache-pinot-developer-blog/building-a-climate-dashboard-with-apache-pinot-and-superset-d3ee8cb7941d&quot;&gt;https://medium.com/apache-pinot-developer-blog/building-a-climate-dashboard-with-apache-pinot-and-superset-d3ee8cb7941d&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/apache-pinot-developer-blog&quot;&gt;https://medium.com/apache-pinot-developer-blog&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://leventov.medium.com/comparison-of-the-open-source-olap-systems-for-big-data-clickhouse-druid-and-pinot-8e042a5ed1c7&quot;&gt;https://leventov.medium.com/comparison-of-the-open-source-olap-systems-for-big-data-clickhouse-druid-and-pinot-8e042a5ed1c7&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trino Meetup Groups&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Virtual
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-americas/&quot;&gt;https://www.meetup.com/trino-americas/&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-emea/&quot;&gt;https://www.meetup.com/trino-emea/&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;Trino APAC - Coming Soon&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;East Coast
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-boston/&quot;&gt;https://www.meetup.com/trino-boston/&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-nyc/&quot;&gt;https://www.meetup.com/trino-nyc/&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;West Coast
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-san-francisco/&quot;&gt;https://www.meetup.com/trino-san-francisco/&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-los-angeles/&quot;&gt;https://www.meetup.com/trino-los-angeles/&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Midwest
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/trino-chicago/&quot;&gt;https://www.meetup.com/trino-chicago/&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Commander Bun Bun loves sippin&apos; on Pinot after a hard day of data exploration!</summary>

      
      
    </entry>
  
    <entry>
      <title>12: Trino gets super visual with Apache Superset!</title>
      <link href="https://trino.io/episodes/12.html" rel="alternate" type="text/html" title="12: Trino gets super visual with Apache Superset!" />
      <published>2021-03-04T00:00:00+00:00</published>
      <updated>2021-03-04T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/12</id>
      <content type="html" xml:base="https://trino.io/episodes/12.html">&lt;h2 id=&quot;guests&quot;&gt;Guests&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Srini Kadamati, Developer Advocate at &lt;a href=&quot;https://preset.io/&quot;&gt;Preset&lt;/a&gt;
 (&lt;a href=&quot;https://twitter.com/SriniKadamati&quot;&gt;@SriniKadamati&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Dr. Beto Dealmeida, Staff Engineer at &lt;a href=&quot;https://preset.io/&quot;&gt;Preset&lt;/a&gt; (&lt;a href=&quot;https://twitter.com/dealmeida&quot;&gt;@dealmeida&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;release-353--almost&quot;&gt;Release 353 – Almost&lt;/h2&gt;

&lt;p&gt;353 is right around the corner. Last show we said this would be a small release.
While there was a correctness issue we resolved, there didn’t seem to be as much
demand to get it out as quickly as we initially thought, so we decided to
continue adding more features to 353. It should be coming out shortly!&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-week-trino-clients-python-and-apache-superset&quot;&gt;Concept of the week: Trino clients, Python, and Apache Superset&lt;/h2&gt;

&lt;p&gt;What is the general data flow from a connected data source?&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Trino workers request data from the data source with the specific connector&lt;/li&gt;
  &lt;li&gt;Workers process the data and send it to the coordinator&lt;/li&gt;
  &lt;li&gt;The coordinator does the final processing&lt;/li&gt;
  &lt;li&gt;It supplies the data via an HTTP/REST stream to the requestor&lt;/li&gt;
  &lt;li&gt;The requestor is a “client” such as the JDBC driver or the Trino CLI&lt;/li&gt;
  &lt;li&gt;The client translates the data further and provides it to an application (a Java
application using the JDBC driver) or to a user interface/directly to the user (output in the CLI)&lt;/li&gt;
  &lt;li&gt;The user views part of the data and scrolls down&lt;/li&gt;
  &lt;li&gt;The client requests more data from the coordinator via HTTP/REST (and see above)&lt;/li&gt;
&lt;/ul&gt;
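&lt;p&gt;The flow above can be sketched as a loop: the client submits the query, then
repeatedly asks the coordinator for the next batch of rows until there are none
left. Here is a minimal pure-Python simulation of that pagination; no real HTTP
is involved, and the names FakeCoordinator and fetch_all are made up for
illustration, not part of the actual protocol:&lt;/p&gt;

```python
# Simulates the client/coordinator interaction described above:
# the client submits a query, then keeps asking the coordinator
# for the next batch of rows until none are left.

class FakeCoordinator:
    """Stands in for a Trino coordinator paging out results."""
    def __init__(self, rows, page_size):
        self.pages = [rows[i:i + page_size]
                      for i in range(0, len(rows), page_size)]

    def submit(self, sql):
        # Returns a page token; real Trino hands back a "next URI" instead.
        return 0

    def next_page(self, token):
        # Returns (rows, next_token); next_token is None when the query is done.
        if token == len(self.pages):
            return [], None
        return self.pages[token], token + 1

def fetch_all(coordinator, sql):
    """What a client such as the CLI or JDBC driver does internally."""
    rows, token = [], coordinator.submit(sql)
    while token is not None:
        page, token = coordinator.next_page(token)
        rows.extend(page)
    return rows

coordinator = FakeCoordinator(rows=list(range(10)), page_size=3)
print(fetch_all(coordinator, "SELECT * FROM t"))  # all 10 rows, fetched in 4 pages
```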

&lt;p&gt;What clients are provided by the Trino project?&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/client/jdbc.html&quot;&gt;JDBC driver&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/docs/current/client/cli.html&quot;&gt;Trino CLI&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino-python-client&quot;&gt;Trino Python client&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/trinodb/trino-go-client&quot;&gt;Trino Go client&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What other clients are there?&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.starburst.io/data-consumer/clients/odbc.html&quot;&gt;ODBC driver from Starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/ecosystem/client.html&quot;&gt;Various other clients&lt;/a&gt; from the open source community
    &lt;ul&gt;
      &lt;li&gt;R&lt;/li&gt;
      &lt;li&gt;NodeJS/Javascript&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What happens in the Python world?&lt;/p&gt;

&lt;p&gt;Disclaimer: I am not a Pythonista or Pythoneer.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;DB-API 2.0
    &lt;ul&gt;
      &lt;li&gt;PEP 249 &lt;a href=&quot;https://www.python.org/dev/peps/pep-0249/&quot;&gt;https://www.python.org/dev/peps/pep-0249/&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;Python standard library&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;trino-python-client
    &lt;ul&gt;
      &lt;li&gt;Wraps complexity of Trino HTTP / REST&lt;/li&gt;
      &lt;li&gt;Supports authentication and such&lt;/li&gt;
      &lt;li&gt;Provides DB API endpoints / implementation&lt;/li&gt;
      &lt;li&gt;Preferred method to query Trino&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.sqlalchemy.org/&quot;&gt;SQLAlchemy&lt;/a&gt;
    &lt;ul&gt;
      &lt;li&gt;SQL toolkit&lt;/li&gt;
      &lt;li&gt;ORM mapper&lt;/li&gt;
      &lt;li&gt;Widely used, e.g. in Apache Superset&lt;/li&gt;
      &lt;li&gt;Supports dialects&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;PyHive
    &lt;ul&gt;
      &lt;li&gt;Not really a SQL wrapper&lt;/li&gt;
      &lt;li&gt;Aimed at Hive QL&lt;/li&gt;
      &lt;li&gt;Only kind of useful for Trino, limited compatibility&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;JDBC driver (Java !) and PySpark
    &lt;ul&gt;
      &lt;li&gt;Possible, but a hack really&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;PyJDBC
    &lt;ul&gt;
      &lt;li&gt;Wraps a DB API around any JDBC driver&lt;/li&gt;
      &lt;li&gt;Kind of a hack since it goes through JDBC to HTTP, when the Trino Python
client does the same more directly&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;PyODBC
    &lt;ul&gt;
      &lt;li&gt;Similar hack to PyJDBC&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;It is also possible to talk to Trino via HTTP directly
    &lt;ul&gt;
      &lt;li&gt;That’s like reimplementing the trino-python-client&lt;/li&gt;
      &lt;li&gt;Also see question of the week later&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;
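&lt;p&gt;DB-API 2.0 defines the same connect/cursor/fetch shape regardless of the
backend. As a sketch, here it is with the standard library sqlite3 module,
which implements PEP 249; trino-python-client exposes the same interface, so
code written against this shape ports over with a different connect call:&lt;/p&gt;

```python
# PEP 249 (DB-API 2.0) in action using sqlite3 from the Python standard
# library. The trino-python-client follows the same interface, so code
# written against this shape only needs a different connect() call.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (id INTEGER, total REAL)")
cur.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, 20.0), (2, 5.0), (3, 42.5)])
cur.execute("SELECT count(*), sum(total) FROM orders")
count, total = cur.fetchone()  # DB-API cursors expose fetchone/fetchmany/fetchall
print(count, total)  # 3 67.5
conn.close()
```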

&lt;p&gt;Beyond that, it will vary from application to application.&lt;/p&gt;

&lt;p&gt;Let’s find out from our guests how this hangs together in Apache Superset, since
it is using Python.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-superset-pr-13105-feat-first-step-native-support-trino&quot;&gt;PR of the week: Superset PR 13105 feat: first step native support Trino&lt;/h2&gt;

&lt;p&gt;This week’s pull request, &lt;a href=&quot;https://github.com/apache/superset/pull/13105&quot;&gt;https://github.com/apache/superset/pull/13105&lt;/a&gt;, was
graciously added by &lt;a href=&quot;https://github.com/dungdm93&quot;&gt;dungdm93&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The first thing we need to understand about this addition is the concept of a
database engine in Superset. A database engine handles a lot of the custom
interactions between various databases and maps them to the interface that
Superset understands. If certain concepts are missing in a given database,
like time granularity or SQL syntax, the database engine for that database
indicates to Superset that this is not available. As a result, the option does
not show up in Superset, or a concise error message is reported. By default,
database engines use the &lt;a href=&quot;https://github.com/apache/superset/blob/master/superset/db_engine_specs/base.py&quot;&gt;base.py&lt;/a&gt;
methods, but each engine, like Trino, adds its custom mappings with a specific
engine implementation,
&lt;a href=&quot;https://github.com/apache/superset/blob/master/superset/db_engine_specs/trino.py&quot;&gt;trino.py&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The pull request adds a few basic custom changes to enable Trino usage with 
Superset. One change ensures that complex timestamps from Trino are truncated to
a format that Superset is able to support during time aggregation operations.&lt;/p&gt;

&lt;p&gt;This opens up a vast amount of functionality for using Trino and Superset. We
wanted to feature this because it goes to show how a small code change, even
one that is not in the Trino repository, can have an outsized effect on those
using Superset and Trino.&lt;/p&gt;

&lt;p&gt;Thank you so much to &lt;a href=&quot;https://github.com/dungdm93&quot;&gt;dungdm93&lt;/a&gt; for making this
change and further linking Trino into a fantastic project like &lt;a href=&quot;https://superset.apache.org/&quot;&gt;Apache
 Superset&lt;/a&gt;!&lt;/p&gt;

&lt;h2 id=&quot;demo-superset-querying-trino-to-create-visualization-dashboard&quot;&gt;Demo: Superset querying Trino to create visualization dashboard&lt;/h2&gt;

&lt;p&gt;To put this PR to the test, we need to connect Apache Superset to Trino as our
datasource.&lt;/p&gt;

&lt;p&gt;First, you need to follow &lt;a href=&quot;https://superset.apache.org/docs/installation/installing-superset-using-docker-compose&quot;&gt;these instructions&lt;/a&gt;
to install Docker (if you don’t already have it installed), and then clone the 
Superset repository:&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git clone https://github.com/apache/superset.git&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Next, you need to set up the database driver for Trino. Navigate to the root
directory of the local Superset repository you just downloaded and run the
following.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;echo &quot;sqlalchemy-trino&quot; &amp;gt;&amp;gt; ./docker/requirements-local.txt&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This tells Superset scripts to install the sqlalchemy-trino library upon
startup. We know the name by looking up &lt;a href=&quot;https://superset.apache.org/docs/databases/trino&quot;&gt;the Trino driver page&lt;/a&gt;
for the driver documentation and how to use the connection string. If you were
to install these directly on a Superset node, you would refer to &lt;a href=&quot;https://superset.apache.org/docs/databases/installing-database-drivers&quot;&gt;this database
 drivers page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Make sure you’re in the root folder of the repo, then run the following
command to start up Superset.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker-compose -f docker-compose-non-dev.yml up&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;After Superset is running, you need to start Trino as well. We did so using a
separate docker-compose app.&lt;/p&gt;

&lt;p&gt;As soon as this is done, you can navigate to Superset’s homepage &lt;a href=&quot;http://localhost:8088&quot;&gt;http://localhost:8088&lt;/a&gt;
and scroll to the &lt;strong&gt;Data&lt;/strong&gt; &amp;gt; &lt;strong&gt;Databases&lt;/strong&gt; menu.&lt;/p&gt;

&lt;p&gt;Click the &lt;strong&gt;+Database&lt;/strong&gt; button.&lt;/p&gt;

&lt;p&gt;Set Name to “Trino” and URI to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;trino://trino@host.docker.internal:8080&lt;/code&gt;
and click &lt;strong&gt;Add&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you want to allow CTAS, CVAS, or DML operations, you’ll want to edit
the database you just created, click on the &lt;strong&gt;SQL LAB SETTINGS&lt;/strong&gt; tab, and
select the operations you want to allow.&lt;/p&gt;
&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/12/connection_settings.png&quot; /&gt;&lt;br /&gt;
 Connection settings that allow for creation/manipulation of tables.
&lt;/p&gt;

&lt;p&gt;You can verify the connection under &lt;strong&gt;SQL Lab&lt;/strong&gt; &amp;gt; &lt;strong&gt;SQL Editor&lt;/strong&gt; by running a SELECT
query.&lt;/p&gt;

&lt;p&gt;We cover adding charts and creating a dashboard in the show. We linked some
blogs from &lt;a href=&quot;https://preset.io/&quot;&gt;Preset&lt;/a&gt; around how to do a lot of this workflow 
in great detail. Find these blogs linked below! Here’s a taste of what we
created in Superset with some &lt;a href=&quot;https://transtats.bts.gov/Fields.asp?gnoyr_VQ=FGJ&quot;&gt;BTS On-Time : Reporting Carrier On-Time
 Performance (1987-present)&lt;/a&gt;
and &lt;a href=&quot;https://data.cdc.gov/Case-Surveillance/COVID-19-Case-Surveillance-Public-Use-Data/vbim-akqf&quot;&gt;Covid Cases&lt;/a&gt; 
reported by the CDC.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/12/covid_flights_data.png&quot; /&gt;&lt;br /&gt;
 COVID-19 and flights data dashboard!
&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-week-how-do-i-use-the-trino-rest-api&quot;&gt;Question of the week: How do I use the Trino REST api?&lt;/h2&gt;

&lt;p&gt;I want to just use the REST API of Trino. Where is the documentation? How do I do that?&lt;/p&gt;

&lt;h3 id=&quot;the-short-answer&quot;&gt;The short answer:&lt;/h3&gt;

&lt;p&gt;Don’t do that. Use a Trino client instead.&lt;/p&gt;

&lt;h3 id=&quot;the-long-answer&quot;&gt;The long answer:&lt;/h3&gt;

&lt;p&gt;The typical desired use case for the REST API is to run a query and get
the result. However, that part of the API is not really a traditional REST API
(HTTP POST, HTTP GET). That just doesn’t work for returning large datasets.
Instead, it is a constantly open connection with a stream of data and interaction
between the client and Trino.&lt;/p&gt;

&lt;p&gt;The clients take care of all this complexity and provide it in a standard API for
the various platforms (JDBC, …). Use the clients!&lt;/p&gt;

&lt;p&gt;And if there is no client, or the existing client is not good enough, create an
open source one or contribute improvements.&lt;/p&gt;

&lt;h3 id=&quot;the-exception&quot;&gt;The exception:&lt;/h3&gt;

&lt;p&gt;There are other simple, pure REST API endpoints that you can use just straight
out of the box. Try &lt;a href=&quot;http://localhost:8080/v1/info&quot;&gt;http://localhost:8080/v1/info&lt;/a&gt; or
&lt;a href=&quot;http://localhost:8080/v1/status&quot;&gt;http://localhost:8080/v1/status&lt;/a&gt;.
You could use those for a liveness/readiness probe in k8s or for cluster status
display. By the way, the Web UI uses those and others.&lt;/p&gt;

&lt;h3 id=&quot;last-note&quot;&gt;Last note&lt;/h3&gt;

&lt;p&gt;If you really can’t help yourself, here are some docs.
&lt;a href=&quot;https://github.com/trinodb/trino/wiki/HTTP-Protocol&quot;&gt;https://github.com/trinodb/trino/wiki/HTTP-Protocol&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://preset.io/blog/2021-03-03-druid-prophet-pt1/&quot;&gt;https://preset.io/blog/2021-03-03-druid-prophet-pt1/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://preset.io/blog/2021-02-11-superset-geodata/&quot;&gt;https://preset.io/blog/2021-02-11-superset-geodata/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://preset.io/blog/2021-01-18-superset-1-0/&quot;&gt;https://preset.io/blog/2021-01-18-superset-1-0/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://preset.io/blog/2021-1-18-recap-2020/&quot;&gt;https://preset.io/blog/2021-1-18-recap-2020/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://preset.io/blog/2020-09-22-slack-dashboard/&quot;&gt;https://preset.io/blog/2020-09-22-slack-dashboard/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://preset.io/blog/2020-10-02-slack-dashboard-part-2/&quot;&gt;https://preset.io/blog/2020-10-02-slack-dashboard-part-2/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://preset.io/blog/2020-10-08-bigquery-superset-part-2/&quot;&gt;https://preset.io/blog/2020-10-08-bigquery-superset-part-2/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Guests Srini Kadamati, Developer Advocate at Preset (@SriniKadamati) Dr. Beto Dealmeida, Staff Engineer at Preset (@dealmeida)</summary>

      
      
    </entry>
  
    <entry>
      <title>11: Dynamic filtering and dynamic partition pruning</title>
      <link href="https://trino.io/episodes/11.html" rel="alternate" type="text/html" title="11: Dynamic filtering and dynamic partition pruning" />
      <published>2021-02-18T00:00:00+00:00</published>
      <updated>2021-02-18T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/11</id>
      <content type="html" xml:base="https://trino.io/episodes/11.html">&lt;h2 id=&quot;release-352&quot;&gt;Release 352&lt;/h2&gt;

&lt;p&gt;Release notes discussed: &lt;a href=&quot;https://trino.io/docs/current/release/release-352.html&quot;&gt;https://trino.io/docs/current/release/release-352.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;No new release to discuss yet except that 353 will be around the corner to fix
a low-impact correctness issue that came out in 352
&lt;a href=&quot;https://github.com/trinodb/trino/pull/6895&quot;&gt;https://github.com/trinodb/trino/pull/6895&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-week-dynamic-filtering&quot;&gt;Concept of the week: Dynamic filtering&lt;/h2&gt;

&lt;p&gt;So we’ve covered a lot on the Trino Community Broadcast to build our way up
to tackling this pretty big subject in the space called dynamic filtering. If
you haven’t seen episodes five through nine, you may want to go back and watch
those for some context for this episode. Episode eight actually diverted to the
Trino rebrand, so we won’t discuss that one. For the recap:&lt;/p&gt;

&lt;p&gt;In &lt;a href=&quot;/episodes/5.html&quot;&gt;episode five&lt;/a&gt;, we spoke about Hive partitions.
In order to save you time when you run a query, Hive stores data under
directories named by the values of the data written underneath that directory.
Take this directory structure for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;orders&lt;/code&gt; table partitioned by the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;orderdate&lt;/code&gt; field:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;orders
├── orderdate=1992-01-01
│   ├── orders_1992-01-01_1.orc
│   ├── orders_1992-01-01_2.orc
│   ├── orders_1992-01-01_3.orc
│   └── ...
├── orderdate=1992-01-02
│   └── ...
├── orderdate=1992-01-03
│   └── ...
└── ...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;When querying for data under January 1st, 1992, according to the Hive model,
query engines like Hive and Trino will only scan ORC files under the 
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;orders/orderdate=1992-01-01&lt;/code&gt; directory. The idea is to avoid scanning
unnecessary data by grouping rows based on a field commonly used in a query.&lt;/p&gt;
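&lt;p&gt;This directory layout is what makes pruning cheap: an equality predicate on
the partition column maps straight to a directory name, so every other
directory can be skipped without reading a single file. A toy illustration
(the file listing below is hypothetical):&lt;/p&gt;

```python
# Toy illustration of Hive-style partition pruning: the partition value
# is encoded in the directory name, so a predicate on the partition
# column selects directories before any data file is opened.

# Hypothetical file listing for the partitioned orders table.
listing = {
    "orders/orderdate=1992-01-01": ["orders_1992-01-01_1.orc",
                                    "orders_1992-01-01_2.orc"],
    "orders/orderdate=1992-01-02": ["orders_1992-01-02_1.orc"],
    "orders/orderdate=1992-01-03": ["orders_1992-01-03_1.orc"],
}

def files_to_scan(listing, partition_column, value):
    """Keep only files under directories matching the predicate."""
    suffix = "{}={}".format(partition_column, value)
    return [f for directory, files in listing.items()
            if directory.endswith(suffix)
            for f in files]

# WHERE orderdate = '1992-01-01' touches only 2 of the 4 files.
print(files_to_scan(listing, "orderdate", "1992-01-01"))
```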

&lt;p&gt;In episode &lt;a href=&quot;/episodes/6.html&quot;&gt;six&lt;/a&gt; and &lt;a href=&quot;/episodes/7.html&quot;&gt;seven&lt;/a&gt;,
we discussed a bit about how a query gets represented internally to Trino once
you submit your SQL query. First, the Parser converts SQL to an abstract syntax
tree (AST) format. Then the planner generates a different tree structure called
the intermediate representation (IR) that contains nodes representing the steps
that need to be performed in order to answer the query. The leaves of the tree 
get executed first, and the parents of each node are dependent on the action of
its child completing before it can start. Finally, the planner and
cost-based optimizer (CBO) run various updates on the IR to optimize the query
plan until it is ready to be executed. To sum it all up, the planner and CBO
generate and optimize the plan by running optimization rules. Refer to chapter 
four in 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;Trino: The Definitive Guide&lt;/a&gt;
pg. 50 for more information.&lt;/p&gt;

&lt;p&gt;In episode &lt;a href=&quot;/episodes/9.html&quot;&gt;nine&lt;/a&gt;, we discussed how hash-joins work
by first drawing a nested-loop analogy to how joins work. We then discussed how
it is advantageous to read the inner loop into memory to avoid a lot of extra
disk calls. Since it is ideal to read an entire table into memory, you likely
want to make sure the table that is built in memory is the smaller of the
two tables. This smaller table is called the build table. The table that gets
streamed is called the probe table. We discussed a bit how hash joins work, which
is a common mechanism to execute joins in a distributed and parallel fashion.&lt;/p&gt;
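&lt;p&gt;The build/probe split can be sketched in a few lines: index the small table
in memory by its join key, then stream the large table against that index. This
is only a rough illustration of the idea, not Trino’s implementation:&lt;/p&gt;

```python
# Hash join sketch: build a hash table from the small (build) side,
# then stream the large (probe) side and look each row up.

def hash_join(build_rows, probe_rows, build_key, probe_key):
    # Build phase: index the small table by its join key.
    index = {}
    for row in build_rows:
        index.setdefault(row[build_key], []).append(row)
    # Probe phase: stream the big table and emit matching row pairs.
    for row in probe_rows:
        for match in index.get(row[probe_key], []):
            merged = dict(match)
            merged.update(row)
            yield merged

items = [{"id": 1, "name": "widget"}, {"id": 2, "name": "gadget"}]  # build side
sales = [{"item_id": 1, "qty": 3}, {"item_id": 2, "qty": 1},
         {"item_id": 1, "qty": 7}]                                  # probe side

for row in hash_join(items, sales, "id", "item_id"):
    print(row)
```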

&lt;p&gt;Another nomenclature akin to build and probe tables is dimension and
fact table, respectively. This nomenclature comes from the &lt;a href=&quot;https://en.wikipedia.org/wiki/Star_schema&quot;&gt;star schema&lt;/a&gt;
used in data warehousing. Typically, large tables called fact tables
live at the center of the schema. These tables tend to have many foreign
keys and a number of quantitative or measurable columns describing an event or
instance. The foreign keys connect these big fact tables to smaller dimension
tables that, when joined, provide human-readable context to enrich the records
in the fact table. The schema ends up looking like a star with the fact table
at the center. In essence, you just need to remember that when someone
describes a fact table, they mean a bigger table that is likely to end up on
the probe side of a join, whereas a dimension table is a more likely candidate
to fit into memory on the build side of a join.&lt;/p&gt;

&lt;p&gt;So let’s get onto the dynamic filtering shall we? First, let’s cover a few
concepts about dynamic filtering, then compare some variations of this concept.&lt;/p&gt;

&lt;p&gt;Dynamic filtering takes advantage of joins with big fact tables to smaller
dimension tables. What makes this filtering different from other types of
filtering is that you are using the smaller build table that is loaded at query
time to generate a list of values that exist in the join column between the
build table and probe table. We know that only values that match these criteria
are going to be returned from the probe side, so we can use this dynamically
generated list as a pushdown predicate on the join column of the probe side.
This means we are still scanning this data, but only sending the subset that
answers the query. We can look at &lt;a href=&quot;/blog/2019/06/30/dynamic-filtering.html&quot;&gt;the blog written for the original local
 dynamic filtering implementation&lt;/a&gt;
by Roman Zeyde for more insights on the original implementation for dynamic
filtering before Raunaq’s changes.&lt;/p&gt;

&lt;p&gt;Local dynamic filtering is definitely beneficial as it allows skipping 
unnecessary stripes or row-groups in the ORC or Parquet reader. However, it
works only for broadcast joins, and its effectiveness depends upon the 
selectivity of the min and max indices maintained in ORC or Parquet files. What
if we could prune entire partitions from the query execution based on dynamic
filters? In the next iteration of dynamic filtering, called dynamic partition 
pruning, we do just that. We take advantage of the partitioned layout of Hive 
tables to avoid generating splits on partitions that won’t exist in the final
query result. The coordinator can identify partitions for pruning based on the
dynamic filters sent to it from the workers processing the build side of the join.
This only works if the query contains a join condition on a column that is
partitioned.&lt;/p&gt;
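&lt;p&gt;The mechanism can be simulated directly: filter the build side, collect the
distinct join-key values that survive into a dynamic filter, and drop
probe-side partitions whose key is not in that set before any splits are
scheduled. A rough sketch with made-up partition and build-side data:&lt;/p&gt;

```python
# Dynamic partition pruning, simulated: the build side is filtered
# first, the surviving join keys become a dynamic filter, and probe
# partitions whose key is not in the filter are never scanned.

# Probe side: a fact table partitioned on its date key, mapped to the
# splits each partition would produce (values are made up).
partitions = {
    20200101: ["split-a", "split-b"],
    20200102: ["split-c"],
    20200103: ["split-d", "split-e"],
}

# Build side: dimension rows that already passed their filter
# (for example, only items with a price over 1000).
build_rows = [{"date_sk": 20200101}, {"date_sk": 20200103}]

# 1. Workers compute the dynamic filter from the filtered build side.
dynamic_filter = {row["date_sk"] for row in build_rows}

# 2. The coordinator prunes partitions before scheduling splits.
splits = [s for key, parts in partitions.items()
          if key in dynamic_filter
          for s in parts]

print(sorted(splits))  # only splits from the 2 matching partitions
```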

&lt;p&gt;With that basic understanding, let’s move on to the PR that implements dynamic
partition pruning!&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-1072-implement-dynamic-partition-pruning&quot;&gt;PR of the week: PR 1072 Implement dynamic partition pruning&lt;/h2&gt;

&lt;p&gt;In this week’s pull request &lt;a href=&quot;https://github.com/trinodb/trino/pull/1072&quot;&gt;https://github.com/trinodb/trino/pull/1072&lt;/a&gt; we
return with Raunaq Morarka and Karol Sobczak. This PR effectively brings in the 
second iteration of dynamic filtering, dynamic partition pruning, where instead
of relying on local dynamic filtering we collect dynamic filters from the
workers in the coordinator and prune out extra splits that aren’t needed with
the partition layout of the probe side table. A query like this for example, 
seen in &lt;a href=&quot;/blog/2020/06/14/dynamic-partition-pruning.html&quot;&gt;Raunaq’s blog about dynamic partition pruning&lt;/a&gt;
shows that if we partition &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;store_sales&lt;/code&gt; on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ss_sold_date_sk&lt;/code&gt; we can take
advantage of this information by sending it to the coordinator.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT COUNT(*) FROM 
sales JOIN items ON sales.item_id = items.id
WHERE items.price &amp;gt; 1000;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Below we show how the execution of this would look in a distributed manner if
you partitioned the sales table on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;item_id&lt;/code&gt;. This is a visual reference for
those listening in on the podcast:&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
1:
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/11/Dynamic
 Filtering1.png&quot; /&gt;&lt;br /&gt;
 Query is sent to the coordinator to be parsed, analyzed, and planned.
&lt;/p&gt;
&lt;p align=&quot;center&quot;&gt;
2:
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/11/Dynamic
 Filtering2.png&quot; /&gt;&lt;br /&gt;
 All workers get a subset of the items (build) table and each worker filters
 out items with price &amp;gt; 1000.
&lt;/p&gt;
&lt;p align=&quot;center&quot;&gt;
3:
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/11/Dynamic
 Filtering3.png&quot; /&gt;&lt;br /&gt;
 All workers create dynamic filter for their item subset and send it to the 
 coordinator.
&lt;/p&gt;
&lt;p align=&quot;center&quot;&gt;
4:
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/11/Dynamic
 Filtering4.png&quot; /&gt;&lt;br /&gt;
 Coordinator uses dynamic filter list to prune out splits and partitions that
 do not overlap with the DF and submits splits to run on workers.
&lt;/p&gt;
&lt;p align=&quot;center&quot;&gt;
5:
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/11/Dynamic
 Filtering5.png&quot; /&gt;&lt;br /&gt;
 Workers run splits over the sales (probe) table.
&lt;/p&gt;
&lt;p align=&quot;center&quot;&gt;
6:
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/11/Dynamic
 Filtering6.png&quot; /&gt;&lt;br /&gt;
 Workers return final rows to be assembled into the final result on the
 coordinator.
&lt;/p&gt;

&lt;h2 id=&quot;pr-demo-pr-1072-implement-dynamic-partition-pruning&quot;&gt;PR Demo: PR 1072 Implement dynamic partition pruning&lt;/h2&gt;

&lt;p&gt;For this PR demo, we have set up one r5.4xlarge coordinator and four r5.4xlarge
workers in a cluster, with an sf100-scale TPC-DS dataset. We will run some of
the TPC-DS queries and perhaps a few others.&lt;/p&gt;

&lt;p&gt;The first query we ran through in the TPC-DS suite was &lt;a href=&quot;https://github.com/trinodb/trino/blob/master/testing/trino-product-tests/src/main/resources/sql-tests/testcases/tpcds/q54.sql&quot;&gt;query 54&lt;/a&gt;.
With this query, we are using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;hive&lt;/code&gt; catalog pointing to AWS S3, with AWS Glue
as our metastore. We initially disable dynamic filtering, then compare to
the times when dynamic filtering is enabled. Without dynamic filtering we find the
query runs in about 92 seconds, whereas with dynamic filtering it runs in 42
seconds. We see similar findings for the semijoin we execute below and discuss
some implications of how the planner actually optimizes the semijoin into an
inner join.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;/* turn dynamic filtering on or off to compare */
SET SESSION enable_dynamic_filtering=false;

SELECT ss_sold_date_sk, COUNT(*) from store_sales WHERE ss_sold_date_sk IN (
  SELECT ws_sold_date_sk FROM (
    SELECT ws_sold_date_sk, COUNT(*) FROM web_sales GROUP BY 1 ORDER BY 2 LIMIT 100
  )
)
GROUP BY 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/codex/how-to-build-a-modern-data-lake-with-minio-db0455eec053&quot;&gt;https://medium.com/codex/how-to-build-a-modern-data-lake-with-minio-db0455eec053&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/quintoandar-tech-blog/building-a-sql-engine-infrastructure-at-quintoandar-73540e136c4e&quot;&gt;https://medium.com/quintoandar-tech-blog/building-a-sql-engine-infrastructure-at-quintoandar-73540e136c4e&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/codex/modern-data-platform-using-open-source-technologies-212ba8273eab&quot;&gt;https://medium.com/codex/modern-data-platform-using-open-source-technologies-212ba8273eab&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Upcoming events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Big Data Technology Warsaw Summit - Workshop Feb 23 - 24 &lt;a href=&quot;https://bigdatatechwarsaw.eu/agenda/&quot;&gt;https://bigdatatechwarsaw.eu/agenda/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Big Data Technology Warsaw Summit - Conference Feb 25 - 26 &lt;a href=&quot;https://bigdatatechwarsaw.eu/agenda/&quot;&gt;https://bigdatatechwarsaw.eu/agenda/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Past events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Starburst Datanova - on demand &lt;a href=&quot;https://www.starburst.io/info/datanova/&quot;&gt;https://www.starburst.io/info/datanova/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Release 352</summary>

      
      
    </entry>
  
    <entry>
      <title>10: Naming the bunny!</title>
      <link href="https://trino.io/episodes/10.html" rel="alternate" type="text/html" title="10: Naming the bunny!" />
      <published>2021-02-04T00:00:00+00:00</published>
      <updated>2021-02-04T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/10</id>
      <content type="html" xml:base="https://trino.io/episodes/10.html">&lt;h2 id=&quot;release-352&quot;&gt;Release 352&lt;/h2&gt;
&lt;p&gt;Release Notes discussed: &lt;a href=&quot;https://trino.io/docs/current/release/release-352.html&quot;&gt;https://trino.io/docs/current/release/release-352.html&lt;/a&gt;
At the time of recording, 352 was not out yet. We discuss a few of the
changes coming down the pipeline to look forward to!&lt;/p&gt;

&lt;h2 id=&quot;naming-our-new-bunny&quot;&gt;Naming our new bunny!&lt;/h2&gt;
&lt;p&gt;That’s right, you submitted your names, and we are now happy to announce the
top candidates. The final name will be chosen by a community poll.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/trino-og.png&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;The running names are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Lepi: short for Lepus, the constellation under Orion that is in the shape
  of a bunny and said to be chased by Orion or Orion’s dogs. They cannot catch it
  because the bunny is fast: &lt;a href=&quot;https://en.wikipedia.org/wiki/Lepus_(constellation)&quot;&gt;https://en.wikipedia.org/wiki/Lepus_(constellation)&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;Neut: an early name used informally by community members to refer to the
  bunny before it had a real name. This name, which forms a portmanteau when
  combined with Trino (Neut-Trino), became popular among a few members.&lt;/li&gt;
  &lt;li&gt;Nu: the Greek letter, with a similar prefix use of Nu + Trino to refer to
  the neutrino origins. In particle physics, nu also represents any of the three
  kinds of neutrino.&lt;/li&gt;
  &lt;li&gt;Commander Bun Bun: a name suggested by a community member’s child who loves
  the bunny!&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2019/05/21/optimizing-the-casts-away.html&quot;&gt;https://trino.io/blog/2019/05/21/optimizing-the-casts-away.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://towardsdatascience.com/statistics-in-spark-sql-explained-22ec389bf71b&quot;&gt;https://towardsdatascience.com/statistics-in-spark-sql-explained-22ec389bf71b&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.infoworld.com/article/3597971/on-premises-data-warehouses-are-dead.html&quot;&gt;https://www.infoworld.com/article/3597971/on-premises-data-warehouses-are-dead.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&quot;&gt;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Upcoming events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Feb 9 - Feb 10 &lt;a href=&quot;http://starburstdata.com/datanova&quot;&gt;http://starburstdata.com/datanova&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Presto® Summit Series - Real world usage&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/05/15/state-of-presto.html&quot;&gt;https://trino.io/blog/2020/05/15/state-of-presto.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&quot;&gt;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&quot;&gt;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&quot;&gt;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Podcasts:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.contributor.fyi/presto&quot;&gt;Presto with Martin Traverso, Dain Sundstrom and David Phillips&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&quot;&gt;Simplify Your Data Architecture With The Presto Distributed SQL Engine&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/5980279&quot;&gt;How Open Source Presto Unlocks a Single Point of Access to Data&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/5656471&quot;&gt;The Data Access Struggle is Real&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://softwareengineeringdaily.com/2020/02/07/presto-with-justin-borgman/&quot;&gt;Presto with Justin Borgman&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/3923864&quot;&gt;The infrastructure renaissance and how it will power the modernization of analytics platforms&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://softwareengineeringdaily.com/2018/05/24/ubers-data-platform-with-zhenxiao-luo/&quot;&gt;Uber’s Data Platform with Zhenxiao Luo&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Release 352 Release Notes discussed: https://trino.io/docs/current/release/release-352.html At the time of recording 352 was not out yet. We will discuss a few of the changes coming down the pipeline to look forward to!</summary>

      
      
    </entry>
  
    <entry>
      <title>9: Distributed hash-joins, and how to migrate to Trino</title>
      <link href="https://trino.io/episodes/9.html" rel="alternate" type="text/html" title="9: Distributed hash-joins, and how to migrate to Trino" />
      <published>2021-01-21T00:00:00+00:00</published>
      <updated>2021-01-21T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/9</id>
      <content type="html" xml:base="https://trino.io/episodes/9.html">&lt;script type=&quot;text/x-mathjax-config&quot;&gt;
  MathJax.Hub.Config({
    tex2jax: {
      inlineMath: [ [&apos;$&apos;,&apos;$&apos;], [&quot;\\(&quot;,&quot;\\)&quot;] ],
      processEscapes: true
    }
  });
&lt;/script&gt;

&lt;script type=&quot;text/javascript&quot; src=&quot;https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;h2 id=&quot;release-351&quot;&gt;Release 351&lt;/h2&gt;
&lt;p&gt;Release Notes discussed: &lt;a href=&quot;https://trino.io/docs/current/release/release-351.html&quot;&gt;https://trino.io/docs/current/release/release-351.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This release was really all about renaming everything from a client perspective
to use Trino instead of Presto. Manfred covers all the work that was done
to accomplish this for the release.&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-week-how-do-i-migrate-from-presto-releases-earlier-than-350-to-trino-releases-351&quot;&gt;Question of the week: How do I migrate from presto releases earlier than 350 to Trino releases 351?&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://trino.io/blog/2021/01/04/migrating-from-prestosql-to-trino.html&quot;&gt;https://trino.io/blog/2021/01/04/migrating-from-prestosql-to-trino.html&lt;/a&gt;&lt;/p&gt;

&lt;h2 id=&quot;concept-of-the-week-distributed-hash-join&quot;&gt;Concept of the week: Distributed Hash-join&lt;/h2&gt;
&lt;p&gt;Joins are one of the most useful and powerful operations performed by databases,
and there are many approaches to joining data. Various types of indices can
facilitate joins. The order in which joins get executed can vary depending on
the geographic distribution of the data, the selectivity of the query (the
fewer rows a predicate returns, the higher its selectivity), and the
information available from indexes and table statistics, all of which can
inform an execution engine how to plan a query. One thing that stays consistent
across virtually every query engine in the world is that joins are executed
over two tables at a time, no matter how many tables appear in the query. Some
joins may run in parallel, but any given join only ever involves two tables.&lt;/p&gt;

&lt;p&gt;If you wrote a simple program that did what a join does, it might look something
like a nested loop:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;public class CartesianProductNestedLoop {
    public static void main(String[] args) {
        int[] outerTable = {2, 4, 6, 8, 10, 12};
        int[] innerTable = {1, 2, 3, 4};

        for (int o : outerTable) {
            for (int i : innerTable) {
                System.out.println(o + &quot;, &quot; + i);
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Since there is no predicate, such as something you would see in a WHERE clause,
the join returns the cartesian product of these two tables. It is also useful
to express these joins in relational algebra. For example, the join above is
written as $O \times I$, where $O$ is the outer table and $I$ is the inner table.
$\times$ indicates that the join we are using is the cartesian product, as we
see below. Another useful way to view this is to visualize the join as a graph.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;33%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/9/cartesian.png&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NOTE: When using relational algebra or a graph to represent a join, it
is conventional that the table in the outer loop of the join is always shown on
the left. This distinction becomes important, as you will see below.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here is the output from the cartesian product join above.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;2, 1
2, 2
2, 3
2, 4
4, 1
4, 2
4, 3
4, 4
6, 1
6, 2
6, 3
6, 4
8, 1
8, 2
8, 3
8, 4
10, 1
10, 2
10, 3
10, 4
12, 1
12, 2
12, 3
12, 4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Notice also that we are treating these tables the same: since we have to read
every value to print out the cartesian product, it doesn’t yet make a
difference which table is the inner table and which is the outer. We could
swap the inner and outer tables and still get the same performance of
$O(n^2)$.&lt;/p&gt;
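&lt;p&gt;You can verify that symmetry directly by counting loop iterations; with the
same toy tables as above, both orderings do exactly $6 \times 4 = 24$ passes
through the inner loop body:&lt;/p&gt;

```java
public class LoopOrderCost {
    static int countIterations(int[] outer, int[] inner) {
        int iterations = 0;
        for (int o : outer) {
            for (int i : inner) {
                iterations++; // one pass of the inner body per pair of rows
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        int[] a = {2, 4, 6, 8, 10, 12};
        int[] b = {1, 2, 3, 4};
        // 6 * 4 = 24 iterations either way: the nested loop is symmetric in cost.
        System.out.println(countIterations(a, b)); // 24
        System.out.println(countIterations(b, a)); // 24
    }
}
```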

&lt;p&gt;Now, what if you did have some criteria that filtered out some of the rows
returned by this product? Since it is quite common to join tables by an id,
the most common criterion for a join is that the values are equal, because
rows with matching ids are related. Initially we can get away with just
adding an if statement, print when true, and be done with it. Let’s
do that.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;public class NaturalJoinNestedLoop {
    public static void main(String[] args) {
        int[] outerTable = {2, 4, 6, 8, 10, 12};
        int[] innerTable = {1, 2, 3, 4};

        for (int o : outerTable) {
            for (int i : innerTable) {
                if(o == i){
                    System.out.println(o + &quot;, &quot; + i);
                }
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Let’s assume that the integers in these tables are values of a column called id
in both tables that uniquely identifies a row in each table. When you have a
commonly named column like this, the operation of joining based on columns that
share the same name is a natural join. In relational algebra it is denoted with
a little bowtie, for example, $O \bowtie I$. We could also use the equi-join
notation that specifies the exact join columns: $O \bowtie_{O.id = I.id} I$. The
graph looks about the same as before; only the operation we are performing
changes.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;33%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/9/natural_join.png&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Now we only get two rows of output, as we would expect.&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;2, 2
4, 4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;One important aspect that gets glossed over in this simple example is that
the data is small and in memory, whereas a database initially has to retrieve the
data from disk. Reading values from memory is roughly
&lt;a href=&quot;https://queue.acm.org/detail.cfm?id=1563874&quot;&gt;100,000 times faster than random access from disk&lt;/a&gt;.
That makes it really important to avoid reading the same values from disk over
and over again: the quadratic number of reads in the nested loop gets
multiplied by that 100,000x penalty.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/9/disk_vs_mem.jpg&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;It would be better if we could read one table into memory once, and reuse those
values as we scan over the data of the other table. Each of these tables has a
common name. Trino first reads the inner table into memory, to avoid having
to read it again for each row in the outer table. We call this the build
table, as the first scan builds the table in memory. Trino then streams
the rows from the outer table and performs the join against the build table. We
call this the probe table.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import java.util.*;

public class BuildProbeLoops {
    public static void main(String[] args) {
        int[] probeTable = {2, 4, 6, 8, 10, 12};
        int[] buildTable = {1, 2, 3, 4};
        Map&amp;lt;Integer, Integer&amp;gt; buildTableCache = new HashMap&amp;lt;&amp;gt;();

        for (int row : buildTable) {
            //in this case the row is actually just the join column
            int hash = row;

            buildTableCache.put(hash, row);
        }

        for (int row : probeTable) {
            //in this case the row is actually just the join column
            int hash = row;

            Integer buildRow = buildTableCache.get(hash);
            if(buildRow != null){
                System.out.println(buildRow + &quot;, &quot; + row);
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;While it may seem redundant to do all of this extra work for such a simple
example, it saves minutes to hours when reading from disk and the data is
big enough. The runtime complexity has now dropped from $O(n^2)$ to
linear, $O(n)$. The relational algebra for this join is now
$P \bowtie B$, where $P$ is the probe table and $B$ is the build table. Notice
that the relational algebra hasn’t really changed; we just now specify that we
build on the inner table and probe with the outer table.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;33%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/9/natural_join2.png&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;One thing to consider is the size of each table: if we are fitting one of the
tables into memory, it’s probably best to choose the smaller table as
the build table. Hopefully this helps you understand why we distinguish
between a build and a probe table. This will help in our discussions about
query optimization and dynamic filtering, which we will cover on the next
show.&lt;/p&gt;
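&lt;p&gt;That planning decision can be sketched in a couple of lines. A real planner
relies on table statistics rather than in-memory lengths, but the idea is
simply to build on the smaller side:&lt;/p&gt;

```java
public class BuildSideChoice {
    public static void main(String[] args) {
        int[] left = {2, 4, 6, 8, 10, 12};  // 6 rows
        int[] right = {1, 2, 3, 4};         // 4 rows

        // Build the hash table from the smaller side so it fits in memory,
        // and stream (probe with) the larger side.
        int[] build = left;
        int[] probe = right;
        if (left.length > right.length) {
            build = right;
            probe = left;
        }

        System.out.println("build rows: " + build.length); // build rows: 4
        System.out.println("probe rows: " + probe.length); // probe rows: 6
    }
}
```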

&lt;p&gt;Another interesting subtopic that we won’t get into today is &lt;a href=&quot;http://www.oaktable.net/content/right-deep-left-deep-and-bushy-joins&quot;&gt;left-deep and right-deep plans&lt;/a&gt;.
Since we now know that the probe table is always on the left and the build table
is on the right, the shape of our query plan matters. Consider the difference
between these two trees.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img width=&quot;33%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/9/left_deep.png&quot; /&gt;
&lt;img width=&quot;33%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/9/right_deep.png&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;Left-deep versus right-deep trees have big implications for the speed of the
query, but that is a bit tangential for our talk today. Let’s finally move on to
hash-joins!&lt;/p&gt;

&lt;p&gt;In Trino, the hash-join is the common algorithm used to join tables. In
fact, the last snippet of code is really all that is involved in implementing a
hash-join, so in explaining probe and build, we have already covered how the
algorithm works conceptually.&lt;/p&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;50%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/9/tables.png&quot; /&gt;
&lt;/p&gt;

&lt;p&gt;The big difference is that Trino implements a distributed hash-join with two
levels of parallelism.&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;Joined tables are distributed over the worker nodes to achieve inter-node
  parallelism. Instead of the hash value simply being used to match with other
  rows, it is also used to route rows to specific Trino worker nodes. Rows that
  meet the equijoin criteria are then processed by the worker responsible for
  that set of ids.&lt;/li&gt;
  &lt;li&gt;Within a node, workers can use the hash to further distribute the rows
  across threads. This intra-node parallelism allows for a single thread per
  hash partition.&lt;/li&gt;
  &lt;li&gt;Finally, once all of these threads have finished determining which rows pass
  the join criteria, the probe side begins to emit rows in larger batches,
  which can quickly be discarded or kept based on which partitions exist on a
  given worker.&lt;/li&gt;
&lt;/ol&gt;

&lt;p align=&quot;center&quot;&gt;
&lt;img align=&quot;center&quot; width=&quot;75%&quot; height=&quot;100%&quot; src=&quot;/assets/episode/9/parallelism.png&quot; /&gt;
&lt;/p&gt;
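&lt;p&gt;The routing in the first step can be illustrated with a toy partitioning
scheme: every row goes to worker hash(key) mod workerCount, so build and probe
rows with equal keys always meet on the same worker. This is a conceptual
sketch, not Trino’s actual exchange implementation:&lt;/p&gt;

```java
public class HashPartitioning {
    public static void main(String[] args) {
        int workerCount = 3;
        int[] buildKeys = {1, 2, 3, 4};
        int[] probeKeys = {2, 4, 6, 8, 10, 12};

        // Each row is routed to worker hash(key) mod workerCount (an int is
        // its own hash here). Equal keys always hash the same way, so matching
        // build and probe rows are guaranteed to land on the same worker.
        for (int worker = 0; worker != workerCount; worker++) {
            for (int probe : probeKeys) {
                if (Math.floorMod(probe, workerCount) == worker) {
                    for (int build : buildKeys) {
                        if (Math.floorMod(build, workerCount) == worker) {
                            if (probe == build) {
                                System.out.println("worker " + worker + " joins key " + probe);
                            }
                        }
                    }
                }
            }
        }
    }
}
```

&lt;p&gt;With three workers, the matching keys 4 and 2 are joined on workers 1 and 2
respectively, and no worker ever needs to see another worker’s partitions.&lt;/p&gt;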

&lt;p&gt;Great resources on this topic, from which some of the examples above derive:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Hash_join&quot;&gt;https://en.wikipedia.org/wiki/Hash_join&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://www.mathcs.emory.edu/~cheung/Courses/554/Syllabus/5-query-opt/join-order2.html&quot;&gt;http://www.mathcs.emory.edu/~cheung/Courses/554/Syllabus/5-query-opt/join-order2.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&quot;&gt;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;how-to-contribute-documentation-and-testimonials&quot;&gt;How to contribute documentation and testimonials&lt;/h2&gt;
&lt;p&gt;Instead of a PR this week, Manfred discusses some notes on how to contribute to
documentation and testimonials.&lt;/p&gt;

&lt;p&gt;If you want to show us some 💕, please &lt;a href=&quot;https://github.com/trinodb/trino/blob/master/.github/star.png&quot;&gt;give us a ⭐ on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2019/05/21/optimizing-the-casts-away.html&quot;&gt;https://trino.io/blog/2019/05/21/optimizing-the-casts-away.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://towardsdatascience.com/statistics-in-spark-sql-explained-22ec389bf71b&quot;&gt;https://towardsdatascience.com/statistics-in-spark-sql-explained-22ec389bf71b&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.infoworld.com/article/3597971/on-premises-data-warehouses-are-dead.html&quot;&gt;https://www.infoworld.com/article/3597971/on-premises-data-warehouses-are-dead.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&quot;&gt;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Upcoming events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Feb 9 - Feb 10 &lt;a href=&quot;http://starburstdata.com/datanova&quot;&gt;http://starburstdata.com/datanova&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Presto® Summit Series - Real world usage&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/05/15/state-of-presto.html&quot;&gt;https://trino.io/blog/2020/05/15/state-of-presto.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&quot;&gt;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&quot;&gt;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&quot;&gt;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Podcasts:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.contributor.fyi/presto&quot;&gt;Presto with Martin Traverso, Dain Sundstrom and David Phillips&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&quot;&gt;Simplify Your Data Architecture With The Presto Distributed SQL Engine&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/5980279&quot;&gt;How Open Source Presto Unlocks a Single Point of Access to Data&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/5656471&quot;&gt;The Data Access Struggle is Real&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://softwareengineeringdaily.com/2020/02/07/presto-with-justin-borgman/&quot;&gt;Presto with Justin Borgman&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/3923864&quot;&gt;The infrastructure renaissance and how it will power the modernization of analytics platforms&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://softwareengineeringdaily.com/2018/05/24/ubers-data-platform-with-zhenxiao-luo/&quot;&gt;Uber’s Data Platform with Zhenxiao Luo&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary></summary>

      
      
    </entry>
  
    <entry>
      <title>8: Trino: A ludicrously fast query engine: past, present, and future</title>
      <link href="https://trino.io/episodes/8.html" rel="alternate" type="text/html" title="8: Trino: A ludicrously fast query engine: past, present, and future" />
      <published>2021-01-11T00:00:00+00:00</published>
      <updated>2021-01-11T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/8</id>
      <content type="html" xml:base="https://trino.io/episodes/8.html">&lt;h2 id=&quot;in-this-episode&quot;&gt;In this episode…&lt;/h2&gt;

&lt;p&gt;Well, we’re back, and no longer waving the Presto® flag like we did before. If
you haven’t heard, Presto® SQL is now Trino
(&lt;a href=&quot;https://trino.io/blog/2020/12/27/announcing-trino.html&quot;&gt;read more about that here&lt;/a&gt;).
In this episode, we sit down with the four original creators of Presto® and
discuss in more detail the journey that led us to our current trajectory with
the Presto® SQL project and why it is now being renamed to Trino. We also
discuss how this affects those that are using Trino. If you are developing on
Trino and still have the old namespace, check out the
&lt;a href=&quot;https://trino.io/blog/2021/01/04/migrating-from-prestosql-to-trino.html&quot;&gt;guide to migrate here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We also discuss the differences between the two projects. There are actually a
lot of them two years after the split, and we recommend looking at the
&lt;a href=&quot;https://trino.io/blog/2020/01/01/2019-summary.html&quot;&gt;blog we wrote at the end of 2019&lt;/a&gt;.
Keep your eyes peeled for the blog we are writing to summarize the changes
in 2020!&lt;/p&gt;

&lt;p&gt;Finally, we cover some sneak peeks at the roadmap for Trino in 2021.&lt;/p&gt;

&lt;p&gt;If you want to show us some 💕, please &lt;a href=&quot;https://github.com/trinodb/trino/blob/master/.github/star.png&quot;&gt;give us a ⭐ on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;

&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2019/05/21/optimizing-the-casts-away.html&quot;&gt;https://trino.io/blog/2019/05/21/optimizing-the-casts-away.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://towardsdatascience.com/statistics-in-spark-sql-explained-22ec389bf71b&quot;&gt;https://towardsdatascience.com/statistics-in-spark-sql-explained-22ec389bf71b&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.infoworld.com/article/3597971/on-premises-data-warehouses-are-dead.html&quot;&gt;https://www.infoworld.com/article/3597971/on-premises-data-warehouses-are-dead.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&quot;&gt;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Upcoming events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Feb 9 - Feb 10 &lt;a href=&quot;http://starburstdata.com/datanova&quot;&gt;http://starburstdata.com/datanova&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Presto® Summit Series - Real world usage&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/05/15/state-of-presto.html&quot;&gt;https://trino.io/blog/2020/05/15/state-of-presto.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&quot;&gt;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&quot;&gt;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&quot;&gt;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Podcasts:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.contributor.fyi/presto&quot;&gt;Presto with Martin Traverso, Dain Sundstrom and David Phillips&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&quot;&gt;Simplify Your Data Architecture With The Presto Distributed SQL Engine&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/5980279&quot;&gt;How Open Source Presto Unlocks a Single Point of Access to Data&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/5656471&quot;&gt;The Data Access Struggle is Real&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://softwareengineeringdaily.com/2020/02/07/presto-with-justin-borgman/&quot;&gt;Presto with Justin Borgman&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/3923864&quot;&gt;The infrastructure renaissance and how it will power the modernization of analytics platforms&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://softwareengineeringdaily.com/2018/05/24/ubers-data-platform-with-zhenxiao-luo/&quot;&gt;Uber’s Data Platform with Zhenxiao Luo&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Trino, check out the definitive guide from
O’Reilly. You can download
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Mega Man 6 Game Play album by Krzysztof
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>In this episode…</summary>

      
      
    </entry>
  
    <entry>
      <title>7: Cost Based Optimizer, Decorrelate subqueries, and does Presto make my RDBMS faster?</title>
      <link href="https://trino.io/episodes/7.html" rel="alternate" type="text/html" title="7: Cost Based Optimizer, Decorrelate subqueries, and does Presto make my RDBMS faster?" />
      <published>2020-11-30T00:00:00+00:00</published>
      <updated>2020-11-30T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/7</id>
      <content type="html" xml:base="https://trino.io/episodes/7.html">&lt;h2 id=&quot;release-348&quot;&gt;Release 348&lt;/h2&gt;
&lt;p&gt;Release Notes discussed: &lt;a href=&quot;https://prestosql.io/docs/current/release/release-348.html&quot;&gt;https://prestosql.io/docs/current/release/release-348.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Martin’s announcement:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Support for OAuth2 authorization in Web UI&lt;/li&gt;
  &lt;li&gt;Support for S3 streaming uploads&lt;/li&gt;
  &lt;li&gt;Support for DISTINCT aggregations in correlated subqueries&lt;/li&gt;
  &lt;li&gt;Performance improvement for ORDER BY … LIMIT queries&lt;/li&gt;
  &lt;li&gt;Many improvements and bug fixes to JDBC driver&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Manfred’s observations:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;SHOW STATS to play around with&lt;/li&gt;
  &lt;li&gt;switch to set Hive view translation to off, legacy, or the new Coral system&lt;/li&gt;
  &lt;li&gt;a bunch of other Hive connector improvements&lt;/li&gt;
  &lt;li&gt;Iceberg on GCP and Azure&lt;/li&gt;
  &lt;li&gt;Small SPI changes&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-week-cost-based-optimizer&quot;&gt;Concept of the week: Cost Based Optimizer&lt;/h2&gt;
&lt;p&gt;We’re continuing our series covering some fundamental topics that build up to
dynamic filtering! This week we’re discussing the cost-based optimizer with
Presto co-creator &lt;a href=&quot;https://twitter.com/mtraverso&quot;&gt;Martin Traverso&lt;/a&gt;!&lt;/p&gt;

&lt;h3 id=&quot;parseranalyzer&quot;&gt;Parser/Analyzer&lt;/h3&gt;

&lt;p&gt;To recap, in &lt;a href=&quot;6.html&quot;&gt;episode 6&lt;/a&gt; we discussed the various
forms a query takes from submission to the coordinator through actual
execution. The parser generates an abstract syntax tree (AST), and the
analyzer checks for valid SQL, including verifying functions and making sure
the tables and columns being referenced actually exist.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/episode/7/ast.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Here’s an example of an abstract syntax tree from last week’s episode for the query
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SELECT * FROM (VALUES 1) t(a) WHERE a = 1 OR 1 = a OR a = 1;&lt;/code&gt;.&lt;/p&gt;

&lt;h3 id=&quot;planner&quot;&gt;Planner&lt;/h3&gt;

&lt;p&gt;The next phase we discussed was the planner. Internally, the planner and
optimizer overlap substantially, but you can think of the planner as the early
part of the planning phase that generates the logical plan, which over several
optimization iterations becomes an optimized distributed plan. The planner
generates a new tree data structure called the plan IR (intermediate
representation) that contains nodes representing the steps that need to be
performed in order to answer the query. The leaves of the tree are executed
first, and each parent node depends on its children completing before it can
start.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/episode/7/logical.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Here’s an example of a logical plan tree using the same query as the AST
above. Since this query isn’t pulling from a data source, the distributed
plan is equivalent to the logical plan.&lt;/p&gt;
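&lt;p&gt;If you want to inspect these plans yourself, Presto can print them with
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXPLAIN&lt;/code&gt;. A quick
sketch, using the same query as above:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot; data-lang=&quot;sql&quot;&gt;-- print the logical plan for the example query
EXPLAIN (TYPE LOGICAL)
SELECT * FROM (VALUES 1) t(a) WHERE a = 1 OR 1 = a OR a = 1;

-- print the distributed plan, which here is equivalent to the logical plan
EXPLAIN (TYPE DISTRIBUTED)
SELECT * FROM (VALUES 1) t(a) WHERE a = 1 OR 1 = a OR a = 1;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;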

&lt;h3 id=&quot;cost-based-optimizer-cbo&quot;&gt;Cost-Based Optimizer (CBO)&lt;/h3&gt;

&lt;p&gt;In the cost-based optimizer phase, various rules are applied to the plan IR
that gradually optimize the structure into the final distributed plan that is
then executed. To do this, the optimizer retrieves statistical metadata about
the tables and their data. This information includes table row counts, column
data sizes, column low/high values, distinct column value counts, and the
percentage of null values in each column. Using rules that leverage these
statistics, the optimizer restructures the query to improve parallelism based
on the number of workers and data sources.&lt;/p&gt;
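&lt;p&gt;You can look at the statistics the optimizer works with by using
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SHOW STATS&lt;/code&gt;, as mentioned
in the release notes above. A quick sketch, with a hypothetical
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;orders&lt;/code&gt; table:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot; data-lang=&quot;sql&quot;&gt;-- per-column statistics: data size, distinct values, null fraction, low/high values
SHOW STATS FOR orders;

-- statistics for the filtered result of a query
SHOW STATS FOR (SELECT * FROM orders WHERE orderstatus = 'F');&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;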

&lt;p&gt;If you want to jump into the code, start at 
&lt;a href=&quot;https://github.com/prestosql/presto/blob/348/presto-main/src/main/java/io/prestosql/sql/planner/LogicalPlanner.java#L188&quot;&gt;the entry point&lt;/a&gt;
for the planner/optimizer and the initial planning starts on 
&lt;a href=&quot;https://github.com/prestosql/presto/blob/348/presto-main/src/main/java/io/prestosql/sql/planner/LogicalPlanner.java#L200&quot;&gt;this line&lt;/a&gt;. 
This loop is where the &lt;a href=&quot;https://github.com/prestosql/presto/blob/348/presto-main/src/main/java/io/prestosql/sql/planner/LogicalPlanner.java#L205&quot;&gt;actual optimization&lt;/a&gt;
occurs. So if you are interested, maybe grab a brandy 🥃 and take some time to
set your debugger at these points and watch the optimizer do its thing!&lt;/p&gt;

&lt;p&gt;Refer to chapter 4 in 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;Trino: The Definitive Guide&lt;/a&gt;
pg. 50.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-1415-decorrelate-subqueries-with-limit-or-topn&quot;&gt;PR of the week: PR 1415 Decorrelate subqueries with Limit or TopN&lt;/h2&gt;

&lt;p&gt;This week’s pull request, &lt;a href=&quot;https://github.com/prestosql/presto/pull/1415&quot;&gt;https://github.com/prestosql/presto/pull/1415&lt;/a&gt;,
was contributed by Presto contributor and Starburst engineer
&lt;a href=&quot;https://github.com/kasiafi&quot;&gt;kasiafi&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Before we can jump into this PR, let’s discuss what a subquery is, and
further, what a correlated subquery is. A subquery is a nested query that runs
within another query, typically embedded in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;WHERE&lt;/code&gt; clause or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SELECT&lt;/code&gt; statement.
Take this query for example:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot; data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;c&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;table&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table2&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;In this example, we have a standard non-correlated subquery that runs on
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;table2&lt;/code&gt;. It is not correlated because it has no dependencies
on the parent query that runs on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;table&lt;/code&gt;. This type of query lets the
SQL engine run the subquery first and then use its results to run the
parent query. In a correlated subquery, at least one criterion in the nested
query depends on the parent, which requires the nested query to be executed
for each row of the parent query. Take a look at this correlated query:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot; data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;c&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t1&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;table2&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;60&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;In this example, the subquery must run in the context of each row of the
parent query in order to evaluate the predicate on &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;t1.b&lt;/code&gt;. Running the
subquery for every row of the parent query is certainly not ideal when it is
not required, which is why subquery decorrelation is a common optimization
technique whenever an equivalent non-correlated subquery exists for a given
correlated subquery.&lt;/p&gt;
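&lt;p&gt;As a rough sketch of what decorrelation looks like in general (the table and
column names here are hypothetical), an aggregating correlated subquery can
often be rewritten as a join against a pre-aggregated inner query:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot; data-lang=&quot;sql&quot;&gt;-- correlated form: the subquery runs once per row of t1
SELECT t1.a
FROM table1 t1
WHERE t1.a &amp;gt; (SELECT max(t2.a) FROM table2 t2 WHERE t2.b = t1.b);

-- decorrelated form: aggregate the inner table once, then join
SELECT t1.a
FROM table1 t1
JOIN (SELECT b, max(a) AS max_a FROM table2 GROUP BY b) agg
  ON t1.b = agg.b
WHERE t1.a &amp;gt; agg.max_a;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;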

&lt;p&gt;This pull request adds a rule that enables Presto to decorrelate a subquery
containing a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LIMIT&lt;/code&gt; or an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ORDER BY&lt;/code&gt; + &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LIMIT&lt;/code&gt; (i.e.
TopN) clause. The common trick during decorrelation is to turn the query into
one that can process the results from the inner table in one shot: the results
of executing the subquery for every row are flattened into a single stream of
rows before the query is finally ready for execution.&lt;/p&gt;

&lt;p&gt;This change also applies to a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LATERAL&lt;/code&gt; join, which behaves much like a nested
subquery, except that it acts as a table and can return multiple rows instead
of just a single row.&lt;/p&gt;

&lt;h2 id=&quot;pr-demo-pr-1415-decorrelate-subqueries-with-limit-or-topn&quot;&gt;PR Demo: PR 1415 Decorrelate subqueries with Limit or TopN&lt;/h2&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot; data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
   &lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
   &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;
   &lt;span class=&quot;k&quot;&gt;LIMIT&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; 
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
   &lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;LIMIT&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
   &lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;ORDER&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;LIMIT&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- Fails&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- 1) Returns more than one row from the subquery.&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- This query actually fails during execution, not during planning/optimizing,&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;-- which is where the other two below fail.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
   &lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
   &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;
   &lt;span class=&quot;k&quot;&gt;LIMIT&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; 
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- 2) LIMIT and correlated non-equality predicate in the subquery&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
   &lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
   &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;LIMIT&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- 3) TopN and correlated non-equality predicate in the subquery&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
   &lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;ORDER&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;BY&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; 
   &lt;span class=&quot;k&quot;&gt;LIMIT&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;After the show, Kasia pointed out that the failing queries were not all
failing for the same reason. The first failing query above actually gets planned
and executed, but the exception occurs during execution. The rest fail during
the planning and optimization phase, because they could not be decorrelated due
to the issue laid out in the comments above.&lt;/p&gt;

&lt;h2 id=&quot;question-of-the-week&quot;&gt;Question of the week: Will running Presto on my relational database make processing faster?&lt;/h2&gt;

&lt;p&gt;In this week’s question, we answer: Will running Presto on my relational 
database make processing faster?&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I have been going over the docs of PrestoSQL and it seems to fit some of my 
requirements. I am a little concerned about the resources needed to run Presto 
in production, because the size of my prod data is between 3-5 GB and there
will be very minimal data growth. Is Presto suitable for such a small 
data size?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The idea that Presto is fast often gets conflated with the idea that
Presto is a good fit for every use case. It is important to understand that
Presto is a) not a database, b) not designed for OLTP workloads, and c) built
to handle data at terabyte to petabyte scale with distributed queries.
Since Presto uses a connector framework, it also has the added benefit of running
federated queries against any data source that can return data in some
columnar representation.&lt;/p&gt;

&lt;p&gt;For relatively small data sets, try your relational database directly first.
Database indexes are very effective outside the big data world, and if you give
your SQL server, say, 10 GB of memory, it should run fully in memory and
therefore be fast.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;
&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://prestosql.io/blog/2019/05/21/optimizing-the-casts-away.html&quot;&gt;https://prestosql.io/blog/2019/05/21/optimizing-the-casts-away.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://towardsdatascience.com/statistics-in-spark-sql-explained-22ec389bf71b&quot;&gt;https://towardsdatascience.com/statistics-in-spark-sql-explained-22ec389bf71b&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.infoworld.com/article/3597971/on-premises-data-warehouses-are-dead.html&quot;&gt;https://www.infoworld.com/article/3597971/on-premises-data-warehouses-are-dead.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&quot;&gt;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Upcoming events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Dec 16 &lt;a href=&quot;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Feb 9 - Feb 10 &lt;a href=&quot;http://starburstdata.com/datanova&quot;&gt;http://starburstdata.com/datanova&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://prestosql.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://prestosql.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://prestosql.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://prestosql.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://prestosql.io/blog/2020/08/13/training-security.html&quot;&gt;https://prestosql.io/blog/2020/08/13/training-security.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://prestosql.io/blog/2020/08/27/training-performance.html&quot;&gt;https://prestosql.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Presto Summit Series - Real world usage&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://prestosql.io/blog/2020/05/15/state-of-presto.html&quot;&gt;https://prestosql.io/blog/2020/05/15/state-of-presto.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://prestosql.io/blog/2020/06/16/presto-summit-zuora.html&quot;&gt;https://prestosql.io/blog/2020/06/16/presto-summit-zuora.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://prestosql.io/blog/2020/07/06/presto-summit-arm-td.html&quot;&gt;https://prestosql.io/blog/2020/07/06/presto-summit-arm-td.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://prestosql.io/blog/2020/07/22/presto-summit-pinterest.html&quot;&gt;https://prestosql.io/blog/2020/07/22/presto-summit-pinterest.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Podcasts:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.contributor.fyi/presto&quot;&gt;Presto with Martin Traverso, Dain Sundstrom and David Phillips&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&quot;&gt;Simplify Your Data Architecture With The Presto Distributed SQL Engine&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/5980279&quot;&gt;How Open Source Presto Unlocks a Single Point of Access to Data&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/5656471&quot;&gt;The Data Access Struggle is Real&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://softwareengineeringdaily.com/2020/02/07/presto-with-justin-borgman/&quot;&gt;Presto with Justin Borgman&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/3923864&quot;&gt;The infrastructure renaissance and how it will power the modernization of analytics platforms&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://softwareengineeringdaily.com/2018/05/24/ubers-data-platform-with-zhenxiao-luo/&quot;&gt;Uber’s Data Platform with Zhenxiao Luo&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Presto yourself, check out the O’Reilly
book Trino: The Definitive Guide. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Release 348 Release Notes discussed: https://prestosql.io/docs/current/release/release-348.html</summary>

      
      
    </entry>
  
    <entry>
      <title>6: Query Planning, Remove duplicate predicates, and Memory settings</title>
      <link href="https://trino.io/episodes/6.html" rel="alternate" type="text/html" title="6: Query Planning, Remove duplicate predicates, and Memory settings" />
      <published>2020-11-30T00:00:00+00:00</published>
      <updated>2020-11-30T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/6</id>
      <content type="html" xml:base="https://trino.io/episodes/6.html">&lt;h2 id=&quot;release-347&quot;&gt;Release 347&lt;/h2&gt;

&lt;p&gt;We discuss the Trino 347 release notes:
&lt;a href=&quot;https://trino.io/docs/current/release/release-347.html&quot;&gt;https://trino.io/docs/current/release/release-347.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Official release announcement from Martin Traverso:&lt;/p&gt;

&lt;p&gt;We’re happy to announce the release of Presto 347! This version includes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for EXCEPT ALL and INTERSECT ALL&lt;/li&gt;
  &lt;li&gt;New syntax for changing the owner of a view&lt;/li&gt;
  &lt;li&gt;Performance improvements when inserting data into Hive tables&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Notes from Manfred:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;New contains_sequence function for arrays.&lt;/li&gt;
  &lt;li&gt;Docker image now based on CentOS 8.&lt;/li&gt;
  &lt;li&gt;Kudu connector gets dynamic filtering support.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;concept-of-the-week-query-planning&quot;&gt;Concept of the week: Query planning&lt;/h2&gt;
&lt;ul&gt;
  &lt;li&gt;Planning happens entirely on the coordinator of the cluster.&lt;/li&gt;
  &lt;li&gt;Before a query can be planned, the coordinator receives the SQL query and
passes it to a parser.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Parser/Analyzer&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The parser parses the SQL query into an AST (abstract syntax tree).&lt;/li&gt;
  &lt;li&gt;The analyzer then checks that the SQL is valid, including functions and other constructs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Planner/Optimizer&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Request metadata about structure from catalogs.
    &lt;ul&gt;
      &lt;li&gt;Do the tables and columns exist?&lt;/li&gt;
      &lt;li&gt;What data types are used?&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Request metadata about content (table stats, data location).&lt;/li&gt;
  &lt;li&gt;Create logical plan
    &lt;ul&gt;
      &lt;li&gt;Are function parameters using right data types?&lt;/li&gt;
      &lt;li&gt;What catalogs/schema/tables/columns need to be accessed?&lt;/li&gt;
      &lt;li&gt;Are joins using compatible field data types?&lt;/li&gt;
      &lt;li&gt;Optimize
        &lt;ul&gt;
          &lt;li&gt;Eliminate redundant conditions.&lt;/li&gt;
          &lt;li&gt;Figure out the best order of operations.&lt;/li&gt;
          &lt;li&gt;Apply filtering as early as possible.&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Create distributed plan (More on this in the next episode!)
    &lt;ul&gt;
      &lt;li&gt;Break logical plan up.&lt;/li&gt;
      &lt;li&gt;Adapt to parallel access by multiple workers to data source.&lt;/li&gt;
      &lt;li&gt;Break up operations so workers aggregate and process data from other workers.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EXPLAIN&lt;/code&gt; to learn what is planned.
Also refer to chapter 4 in 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;Trino: The Definitive Guide&lt;/a&gt;
pg. 50.&lt;/p&gt;

&lt;h2 id=&quot;pr-of-the-week-pr-730-remove-duplicate-predicates&quot;&gt;PR of the week: PR 730 Remove duplicate predicates&lt;/h2&gt;

&lt;p&gt;This week’s pull request, &lt;a href=&quot;https://github.com/trinodb/trino/pull/730&quot;&gt;https://github.com/trinodb/trino/pull/730&lt;/a&gt;,
comes from co-creator &lt;a href=&quot;https://github.com/martint&quot;&gt;Martin Traverso&lt;/a&gt;.
It removes duplicate predicates in logical binary expressions
(AND, OR) and canonicalizes commutative arithmetic expressions and comparisons
to handle a larger number of variants. Canonicalize is a big word, but all it
means is that multiple representations of the same logic or data are
simplified to an agreed-upon normal form.&lt;/p&gt;

&lt;p&gt;For example, the statement &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;COALESCE(a * (2 * 3), 1 - 1)&lt;/code&gt; is 
equivalent to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;COALESCE(6 * a, 0)&lt;/code&gt;, as the expression 2 * 3 can
be simplified to a constant integer.&lt;/p&gt;
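The constant-folding half of this can be sketched in a few lines of Python. This is a toy illustration of the idea only, not Trino's actual optimizer code, which operates on its own plan representation:

```python
# Toy constant folder for expression trees like a * (2 * 3) and 1 - 1.
# Expressions are tuples (op, left, right); anything else is a leaf
# (a number, or a column name such as "a").

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def fold(expr):
    """Recursively replace operator nodes with constants where both sides are numbers."""
    if not isinstance(expr, tuple):
        return expr  # leaf node: nothing to fold
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)  # both sides constant: evaluate now
    return (op, left, right)

print(fold(("*", "a", ("*", 2, 3))))  # a * (2 * 3) folds to ('*', 'a', 6)
print(fold(("-", 1, 1)))              # 1 - 1 folds to 0
```

A real canonicalizer would additionally reorder commutative operands into a normal form (for example, constants first), which this sketch omits.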

&lt;p&gt;This is an example of a logical plan optimization because it rewrites the
query expressions themselves. It differs from the distributed plan in that we
are not determining how the plan will be distributed or where it will run, and
it does not include further optimizations handled by the cost-based optimizer,
such as predicate pushdown. We’ll talk about that step more in the next
episode. For now, let’s cover a few examples.&lt;/p&gt;

&lt;h2 id=&quot;demo-pr-730-remove-duplicate-predicates&quot;&gt;Demo: PR 730 Remove duplicate predicates&lt;/h2&gt;
&lt;p&gt;The EXPLAIN output format used is &lt;a href=&quot;https://graphviz.org/&quot;&gt;Graphviz&lt;/a&gt;. The
online tool used during the show is &lt;a href=&quot;http://viz-js.com/&quot;&gt;Viz.js&lt;/a&gt;. You can paste
the output of your EXPLAIN queries into it to visualize the query as a tree.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot; data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;EXPLAIN&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
 &lt;span class=&quot;n&quot;&gt;FORMAT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GRAPHVIZ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;TYPE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;LOGICAL&lt;/span&gt;
 &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;OR&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;OR&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;EXPLAIN&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
 &lt;span class=&quot;n&quot;&gt;FORMAT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GRAPHVIZ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;TYPE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;LOGICAL&lt;/span&gt;
 &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;OR&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;OR&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; 

&lt;span class=&quot;k&quot;&gt;EXPLAIN&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
 &lt;span class=&quot;n&quot;&gt;FORMAT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GRAPHVIZ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;TYPE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DISTRIBUTED&lt;/span&gt;
 &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tpch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tiny&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;custkey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;custkey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;50&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;custkey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;50&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;custkey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;50&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;custkey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;50&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;  

&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; 
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tpch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tiny&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o&lt;/span&gt; 
  &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tpch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tiny&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;customer&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;c&lt;/span&gt; 
  &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;custkey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;custkey&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;o&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;custkey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;50&lt;/span&gt; 
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;custkey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;custkey&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;50&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;LIMIT&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;h2 id=&quot;question-of-the-week-how-should-i-allocate-memory-properties&quot;&gt;Question of the week: How should I allocate memory properties?&lt;/h2&gt;

&lt;p&gt;In this week’s question, we answer:&lt;/p&gt;
&lt;blockquote&gt;
  &lt;p&gt;How should I allocate memory properties? CPU : 16Core  MEM:64GB&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Before answering this, we should make sure a few things about memory are clear.&lt;/p&gt;

&lt;h3 id=&quot;user-memory&quot;&gt;User memory&lt;/h3&gt;
&lt;p&gt;Memory for things the user can reason about:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Input Data&lt;/li&gt;
  &lt;li&gt;Hash tables used during execution&lt;/li&gt;
  &lt;li&gt;Sorting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Settings&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query.max-memory-per-node&lt;/code&gt;&lt;/strong&gt; - maximum amount of user memory that a query
is allowed to use on a given worker.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query.max-memory&lt;/code&gt;&lt;/strong&gt; (without the -per-node at the end) - This config caps 
the amount of user memory used by a single query over all worker nodes in your 
cluster.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;system-memory&quot;&gt;System memory&lt;/h3&gt;
&lt;p&gt;Memory needed to facilitate internal usage&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Shuffle buffers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;NOTE: There are no settings for system memory, as it is implicitly defined
by the user and total memory settings. Use these to calculate system memory:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;max system memory per node = &lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query.max-total-memory-per-node&lt;/code&gt;&lt;/strong&gt; - 
 &lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query.max-memory-per-node&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;max system memory = &lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query.max-total-memory&lt;/code&gt;&lt;/strong&gt; - &lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query.max-memory&lt;/code&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;total-memory&quot;&gt;Total memory&lt;/h3&gt;
&lt;p&gt;Total Memory = System + User, but there are only properties for total and
user memory.&lt;/p&gt;

&lt;p&gt;Settings&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query.max-total-memory-per-node&lt;/code&gt;&lt;/strong&gt; - maximum amount of total memory that a
  query is allowed to use on a given worker.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query.max-total-memory&lt;/code&gt;&lt;/strong&gt; (without the -per-node at the end) - This config 
caps the total memory used by a single query over all worker nodes in your
cluster.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;heap-headroom&quot;&gt;Heap headroom&lt;/h3&gt;
&lt;p&gt;The final setting I would like to cover is 
&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memory.heap-headroom-per-node&lt;/code&gt;&lt;/strong&gt;. This config sets aside a portion of
the JVM heap for allocations that are not tracked by Presto. You can typically
go with the default for this setting, which is 30% of the JVM’s max heap size 
(the -Xmx setting).&lt;/p&gt;

&lt;h3 id=&quot;jvm-heap-memory--xmx-setting&quot;&gt;JVM heap memory (-Xmx setting)&lt;/h3&gt;
&lt;p&gt;Presto is a Java application, so it runs on the JVM. None of these memory
settings mean anything until the JVM that Presto runs on has sufficient memory
set aside. So how do you know you are allocating sufficient memory based on
your settings?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;query.max-total-memory-per-node&lt;/code&gt;&lt;/strong&gt; + &lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memory.heap-headroom-per-node&lt;/code&gt;&lt;/strong&gt; &amp;lt; 
 &lt;strong&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-Xmx&lt;/code&gt; setting (Java heap)&lt;/strong&gt;&lt;/p&gt;
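That rule can be worked through with concrete numbers. A minimal sketch in Python, where the chosen values (a 48 GB heap, 20 GB user memory, 30 GB total memory per node) are illustrative assumptions, not recommendations:

```python
# Sanity-check hypothetical Presto memory settings on one worker.
# All values in GB; these numbers are illustrative, not recommendations.

xmx = 48                            # -Xmx: heap given to the JVM
max_memory_per_node = 20            # query.max-memory-per-node (user memory)
max_total_memory_per_node = 30      # query.max-total-memory-per-node (user + system)
heap_headroom_per_node = 0.3 * xmx  # memory.heap-headroom-per-node default: 30% of -Xmx

# Implied system memory budget per node (total minus user):
system_memory_per_node = max_total_memory_per_node - max_memory_per_node

# The rule above, stated as: -Xmx must exceed
# query.max-total-memory-per-node + memory.heap-headroom-per-node
assert xmx > max_total_memory_per_node + heap_headroom_per_node

print(system_memory_per_node)  # 10
```

With these numbers, 30 GB + 14.4 GB of headroom still fits under the 48 GB heap, so the configuration is consistent.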

&lt;p&gt;&lt;img src=&quot;/assets/episode/6/memory_pools.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Dain covers the proportions in detail in the recent training videos.
Here’s a snippet of what he recommends.&lt;/p&gt;

&lt;iframe width=&quot;1058&quot; height=&quot;595&quot; src=&quot;https://www.youtube.com/embed/Pu80FkBRP-k?start=2569&amp;amp;end=2674&quot; frameborder=&quot;0&quot; allow=&quot;accelerometer; autoplay; clipboard-write; 
encrypted-media; gyroscope; picture-in-picture&quot; allowfullscreen=&quot;&quot;&gt;
&lt;/iframe&gt;

&lt;p&gt;All in all, try to estimate the amount of memory needed by your maximum
anticipated query load, and if possible provision even more than your estimate.
Once users discover Presto, they will use it more and more, and demands on the
system will grow.&lt;/p&gt;

&lt;h2 id=&quot;events-news-and-various-links&quot;&gt;Events, news, and various links&lt;/h2&gt;
&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2019/05/21/optimizing-the-casts-away.html&quot;&gt;https://trino.io/blog/2019/05/21/optimizing-the-casts-away.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://towardsdatascience.com/statistics-in-spark-sql-explained-22ec389bf71b&quot;&gt;https://towardsdatascience.com/statistics-in-spark-sql-explained-22ec389bf71b&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.infoworld.com/article/3597971/on-premises-data-warehouses-are-dead.html&quot;&gt;https://www.infoworld.com/article/3597971/on-premises-data-warehouses-are-dead.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&quot;&gt;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Upcoming events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Dec 8 &lt;a href=&quot;https://www.meetup.com/Warsaw-Data-Engineering/events/274939817/&quot;&gt;https://www.meetup.com/Warsaw-Data-Engineering/events/274939817/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 9 &lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-providence-ri/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-providence-ri/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 10 &lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-denver-co/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-denver-co/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 10 &lt;a href=&quot;https://www.evanta.com/cdo/san-francisco/2020-san-francisco-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/san-francisco/2020-san-francisco-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 16 &lt;a href=&quot;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Presto Summit Series - Real world usage&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/05/15/state-of-presto.html&quot;&gt;https://trino.io/blog/2020/05/15/state-of-presto.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&quot;&gt;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&quot;&gt;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&quot;&gt;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Podcasts:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.contributor.fyi/presto&quot;&gt;Presto with Martin Traverso, Dain Sundstrom and David Phillips&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&quot;&gt;Simplify Your Data Architecture With The Presto Distributed SQL Engine&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/5980279&quot;&gt;How Open Source Presto Unlocks a Single Point of Access to Data&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/5656471&quot;&gt;The Data Access Struggle is Real&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://softwareengineeringdaily.com/2020/02/07/presto-with-justin-borgman/&quot;&gt;Presto with Justin Borgman&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://redhatx.buzzsprout.com/755519/3923864&quot;&gt;The infrastructure renaissance and how it will power the modernization of analytics platforms&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://softwareengineeringdaily.com/2018/05/24/ubers-data-platform-with-zhenxiao-luo/&quot;&gt;Uber’s Data Platform with Zhenxiao Luo&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Presto yourself, check out the O’Reilly
book Trino: The Definitive Guide. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Release 347</summary>

      
      
    </entry>
  
    <entry>
      <title>5: Hive Partitions, sync_partition_metadata, and Query Exceeded Max Columns!</title>
      <link href="https://trino.io/episodes/5.html" rel="alternate" type="text/html" title="5: Hive Partitions, sync_partition_metadata, and Query Exceeded Max Columns!" />
      <published>2020-11-19T00:00:00+00:00</published>
      <updated>2020-11-19T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/5</id>
      <content type="html" xml:base="https://trino.io/episodes/5.html">&lt;p&gt;In this week’s concept, Manfred discusses Hive Partitioning.&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Concept from RDBMS systems implemented in HDFS&lt;/li&gt;
  &lt;li&gt;Normally just multiple files in a directory per table&lt;/li&gt;
  &lt;li&gt;Lots of different file formats, but always one directory&lt;/li&gt;
  &lt;li&gt;Partitioning creates nested directories&lt;/li&gt;
  &lt;li&gt;Needs to be set up at start of table creation&lt;/li&gt;
  &lt;li&gt;CTAS query&lt;/li&gt;
  &lt;li&gt;Uses WITH (partitioned_by = ARRAY['date'])&lt;/li&gt;
  &lt;li&gt;Results in tablename/date=2020-11-19&lt;/li&gt;
  &lt;li&gt;Can also nest deeper: WITH (partitioned_by = ARRAY['date', 'countrycode'])&lt;/li&gt;
  &lt;li&gt;Can greatly enhance performance&lt;/li&gt;
  &lt;li&gt;Optimizer can determine what directories to read based on field&lt;/li&gt;
  &lt;li&gt;Especially useful when fields are used in WHERE clauses&lt;/li&gt;
  &lt;li&gt;Also useful for managing historic data over time, such as moving data out
to an archive, deleting data, replacing data with aggregates, or just
  running compaction on subsets&lt;/li&gt;
  &lt;li&gt;Presto can use DELETE on partitions with DELETE FROM table WHERE date=value&lt;/li&gt;
  &lt;li&gt;Also possible to create empty partitions upfront with CALL system.create_empty_partition&lt;/li&gt;
&lt;/ul&gt;
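
&lt;p&gt;The CTAS approach from the notes above can be sketched like this, assuming a
hypothetical source table &lt;code&gt;minio.part.events&lt;/code&gt; that already has a
&lt;code&gt;dt&lt;/code&gt; column:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- Create and populate a partitioned table in one statement (CTAS).
-- Partition columns must be listed last in the SELECT.
CREATE TABLE minio.part.events_by_date
WITH (
  format = 'ORC',
  partitioned_by = ARRAY['dt']
)
AS SELECT id, name, dt
FROM minio.part.events;&lt;/code&gt;&lt;/pre&gt;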

&lt;p&gt;See here for more details: &lt;a href=&quot;https://www.educba.com/partitioning-in-hive/&quot;&gt;https://www.educba.com/partitioning-in-hive/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This week’s pull request, &lt;a href=&quot;https://github.com/trinodb/trino/pull/223&quot;&gt;https://github.com/trinodb/trino/pull/223&lt;/a&gt;, 
came from contributor &lt;a href=&quot;https://github.com/luohao&quot;&gt;Hao Luo&lt;/a&gt;. The procedure it adds
is similar to Hive’s &lt;a href=&quot;https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RecoverPartitions(MSCKREPAIRTABLE)&quot;&gt;MSCK REPAIR TABLE&lt;/a&gt;:
if it finds a Hive partition directory in the filesystem that exists but has
no partition entry in the metastore, it adds the entry to the
metastore. If there is an entry in the metastore but the partition was deleted
from the filesystem, it removes the metastore entry. You can find
more information about &lt;a href=&quot;https://trino.io/docs/current/connector/hive.html#procedures&quot;&gt;this procedure in the documentation&lt;/a&gt;.&lt;/p&gt;
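
&lt;p&gt;As a sketch, the procedure takes a schema name, a table name, and a mode, and is
called against the Hive catalog in use (here with the schema and table names from
the demo):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-sql&quot;&gt;-- ADD: create metastore entries for partition directories found in the filesystem
CALL system.sync_partition_metadata('part', 'orders', 'ADD');
-- DROP: remove metastore entries whose directories no longer exist
CALL system.sync_partition_metadata('part', 'orders', 'DROP');
-- FULL: do both ADD and DROP in one pass
CALL system.sync_partition_metadata('part', 'orders', 'FULL');&lt;/code&gt;&lt;/pre&gt;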

&lt;p&gt;Here are the commands and SQL statements run against Presto during the show.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot; data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;k&quot;&gt;SHOW&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CATALOGS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;SHOW&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SCHEMAS&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SHOW&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TABLES&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;IN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;SCHEMA&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;location&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;s3a://part/&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- Create a table with no partitions&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;no_part&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;varchar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dt&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;varchar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;format&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;ORC&apos;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;no_part&lt;/span&gt; 
&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-1&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-18&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-2&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-18&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-3&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-19&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-4&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-19&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-5&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-20&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-6&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-20&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;varchar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dt&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;varchar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;format&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;ORC&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;partitioned_by&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ARRAY&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;dt&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt; 
&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-1&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-18&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-2&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-18&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-3&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-19&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-4&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-19&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-5&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-20&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-6&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-20&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;no_part&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dt&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-20&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
 
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dt&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-20&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;DELETE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt; 
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dt&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020-11-18&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;orders&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- Make sure you are using the minio catalog (a renamed hive catalog)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CALL&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;system&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sync_partition_metadata&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;part&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;orders&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;ADD&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CALL&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;system&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sync_partition_metadata&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;part&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;orders&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;DROP&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CALL&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;system&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sync_partition_metadata&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;part&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;orders&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;FULL&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

 &lt;span class=&quot;c1&quot;&gt;-- Create a table with multi partitions&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;CREATE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;TABLE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;multi_part&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;id&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;varchar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;year&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;varchar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;month&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;varchar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;day&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;varchar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WITH&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;format&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;ORC&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;partitioned_by&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ARRAY&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;year&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;month&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;day&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;INSERT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;INTO&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;minio&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;part&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;multi_part&lt;/span&gt; 
&lt;span class=&quot;k&quot;&gt;VALUES&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-1&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;11&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;18&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-2&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;11&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;18&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-3&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;11&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;19&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-4&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;11&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;19&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-5&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;11&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;20&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-6&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2020&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;11&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;20&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-7&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2019&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;11&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;18&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-8&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2019&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;01&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;18&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-9&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2019&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;11&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;19&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-10&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2019&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;01&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;19&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-11&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2019&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;11&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;20&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; 
  &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part-12&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;2019&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;01&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;20&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;We ran some queries against the metastore database. It’s a complicated model, so 
here is a database diagram showing the different tables and their relations in
the metastore.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/episode/5/hive_metastore_database_diagram.png&quot; alt=&quot;&quot; /&gt;
This diagram was generated by niftimusmaximus on 
&lt;a href=&quot;https://analyticsanvil.wordpress.com/2016/08/21/useful-queries-for-the-hive-metastore/&quot;&gt;The Analytics Anvil&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;MariaDB (metastore database)&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-sql&quot; data-lang=&quot;sql&quot;&gt;&lt;span class=&quot;n&quot;&gt;USE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;metastore_db&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- show database&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DBS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- show tables given a database&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DBS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TBLS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DB_ID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DB_ID&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NAME&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;part&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- show location and input format of the table given database/table names&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SD_ID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;INPUT_FORMAT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;LOCATION&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SERDE_ID&lt;/span&gt; 
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DBS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TBLS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DB_ID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DB_ID&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SDS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SD_ID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SD_ID&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TBL_NAME&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;orders&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NAME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;part&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- show (de)serializer format of the table given database/table names&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SERDE_ID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NAME&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SLIB&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DBS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TBLS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DB_ID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DB_ID&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SDS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SD_ID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SD_ID&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SERDES&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SERDE_ID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SERDE_ID&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TBL_NAME&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;orders&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NAME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;part&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- show columns of the table given database/table names&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; 
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DBS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TBLS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DB_ID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DB_ID&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SDS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SD_ID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SD_ID&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;COLUMNS_V2&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CD_ID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CD_ID&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TBL_NAME&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;orders&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NAME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;part&apos;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;ORDER&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;by&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CD_ID&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;INTEGER_IDX&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;-- show partitions of the table given database/table names&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;SELECT&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;LOCATION&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DBS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TBLS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DB_ID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DB_ID&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;PARTITIONS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TBL_ID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TBL_ID&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;JOIN&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;SDS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;ON&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SD_ID&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SD_ID&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;WHERE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;TBL_NAME&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;&apos;orders&apos;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;AND&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NAME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;part&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;In this week’s question, we answer:&lt;/p&gt;
&lt;blockquote&gt;
  &lt;p&gt;Why am I getting, “Query exceeded maximum columns. Please reduce the number 
of columns referenced and re-run the query.”?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Example:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;I’m running this query to check for duplicates. My table has approx. 650
columns and I get this error.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT *, COUNT(1) 
FROM tbl 
GROUP BY * 
HAVING COUNT(1) &amp;gt; 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;getting a stack trace like this:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;io.prestosql.spi.PrestoException: Compiler failed
	at io.prestosql.sql.planner.LocalExecutionPlanner$Visitor.visitScanFilterAndProject(LocalExecutionPlanner.java:1306)
	at io.prestosql.sql.planner.LocalExecutionPlanner$Visitor.visitProject(LocalExecutionPlanner.java:1185)
	at io.prestosql.sql.planner.LocalExecutionPlanner$Visitor.visitProject(LocalExecutionPlanner.java:705)
	at io.prestosql.sql.planner.plan.ProjectNode.accept(ProjectNode.java:82)
	at io.prestosql.sql.planner.LocalExecutionPlanner$Visitor.visitAggregation(LocalExecutionPlanner.java:1119)
	at io.prestosql.sql.planner.LocalExecutionPlanner$Visitor.visitAggregation(LocalExecutionPlanner.java:705)
	at io.prestosql.sql.planner.plan.AggregationNode.accept(AggregationNode.java:204)
	at io.prestosql.sql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:461)
	at io.prestosql.sql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:432)
	at io.prestosql.execution.SqlTaskExecutionFactory.create(SqlTaskExecutionFactory.java:75)
	at io.prestosql.execution.SqlTask.updateTask(SqlTask.java:382)
	at io.prestosql.execution.SqlTaskManager.updateTask(SqlTaskManager.java:383)
	at io.prestosql.server.TaskResource.createOrUpdateTask(TaskResource.java:128)
	at jdk.internal.reflect.GeneratedMethodAccessor480.invoke(Unknown Source)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The throwable that causes this error, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MethodTooLargeException&lt;/code&gt;, comes from the
&lt;a href=&quot;https://asm.ow2.io/&quot;&gt;ASM library&lt;/a&gt; when Trino asks it to create a method with more
bytecode than the JVM specification allows.&lt;/p&gt;

&lt;p&gt;Trino generates bytecode to handle the given query, and in this case the
generated code is too large. Since the amount of generated code is proportional
to the number of columns referenced, we rewrap the exception in a message that
is more meaningful to the user.&lt;/p&gt;

&lt;p&gt;The general strategy is to reduce the number of columns that you reference.&lt;/p&gt;

&lt;p&gt;The trade-off is that removing columns also removes information from the
query. In the duplicate-check example above, you won’t be able to rule out
false positive matches, but the result may still be good enough to narrow the
search space. As always, it depends…&lt;/p&gt;
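&lt;p&gt;One way to cut the column count is to group on a subset of columns that are
likely to identify a row. The column names below are placeholders for
illustration, not from the original question:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;-- group only on a few likely-identifying columns
SELECT col1, col2, col3, COUNT(1)
FROM tbl
GROUP BY col1, col2, col3
HAVING COUNT(1) &amp;gt; 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;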

&lt;p&gt;To learn more about the JVM limit, search for code_length in the Java SE
specification:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.7.3&quot;&gt;SE8&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.7.3&quot;&gt;SE11&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Special thanks to &lt;a href=&quot;https://github.com/hashhar&quot;&gt;Ashhar Hasan&lt;/a&gt; for asking this 
question and providing some useful context!&lt;/p&gt;

&lt;p&gt;Release Notes discussed:
&lt;a href=&quot;https://trino.io/docs/current/release/release-346.html&quot;&gt;https://trino.io/docs/current/release/release-346.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Manfred’s Training - SQL at any scale
&lt;a href=&quot;https://www.simpligility.com/2020/10/join-me-for-presto-first-steps/&quot;&gt;https://www.simpligility.com/2020/10/join-me-for-presto-first-steps/&lt;/a&gt;
&lt;a href=&quot;https://learning.oreilly.com/live-training/courses/presto-first-steps/0636920462859/&quot;&gt;https://learning.oreilly.com/live-training/courses/presto-first-steps/0636920462859/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.javahelps.com/2020/05/presto-sql-for-newbies.html&quot;&gt;https://www.javahelps.com/2020/05/presto-sql-for-newbies.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.javahelps.com/2020/04/setup-presto-sql-development-environment.html&quot;&gt;https://www.javahelps.com/2020/04/setup-presto-sql-development-environment.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.javahelps.com/2019/11/presto-sql-types-of-joins.html&quot;&gt;https://www.javahelps.com/2019/11/presto-sql-types-of-joins.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&quot;&gt;https://www.javahelps.com/2019/11/presto-sql-join-algorithms.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/analytics-vidhya/deploying-starburst-enterprise-presto-on-googles-kubernetes-engine-with-storage-and-postgres-72483b10ab62&quot;&gt;https://medium.com/analytics-vidhya/deploying-starburst-enterprise-presto-on-googles-kubernetes-engine-with-storage-and-postgres-72483b10ab62&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Upcoming events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Nov 19 Presto Tokyo Conference - Japanese &lt;a href=&quot;https://techplay.jp/event/795265&quot;&gt;https://techplay.jp/event/795265&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Nov 24 EMEA - Polish &lt;a href=&quot;https://www.meetup.com/Warsaw-Data-Engineering/events/274666392/&quot;&gt;https://www.meetup.com/Warsaw-Data-Engineering/events/274666392/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 2 &lt;a href=&quot;https://www.evanta.com/cdo/atlanta/2020-atlanta-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/atlanta/2020-atlanta-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 3 EMEA &lt;a href=&quot;https://www.starburstdata.com/introduction-to-presto/&quot;&gt;https://www.starburstdata.com/introduction-to-presto/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 9 &lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-providence-ri/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-providence-ri/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 10 &lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-denver-co/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-denver-co/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 10 &lt;a href=&quot;https://www.evanta.com/cdo/san-francisco/2020-san-francisco-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/san-francisco/2020-san-francisco-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 16 &lt;a href=&quot;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Presto Summit Series - Real world usage&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/05/15/state-of-presto.html&quot;&gt;https://trino.io/blog/2020/05/15/state-of-presto.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&quot;&gt;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&quot;&gt;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&quot;&gt;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Recent Podcasts:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.contributor.fyi/presto&quot;&gt;https://www.contributor.fyi/presto&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&quot;&gt;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Presto yourself, you should check out the 
O’Reilly book Trino: The Definitive Guide. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>In this week’s concept, Manfred discusses Hive Partitioning. Concept from RDBMS systems implemented in HDFS Normally just multiple files in a directory per table Lots of different file formats, but always one directory Partitioning creates nested directories Needs to be set up at start of table creation CTAS query Uses WITH ( partitioned_by = ARRAY[‘date’]) Results in tablename/date=2020-11-19 Can also nest deeper WITH ( partitioned_by = ARRAY[‘date’, ‘countrycode’]) Can greatly enhance performance Optimizer can determine what directories to read based on field Especially useful when fields are used in WHERE clauses Also useful for historic data management over time such as moving data out to archive, deleting data, or replacing data with aggregates, or even just running compaction on subsets Presto can use DELETE on partitions using DELETE FROM table WHERE date=value Also possible to create empty partitions upfront CALL system.create_empty_partition See here for more details: https://www.educba.com/partitioning-in-hive/</summary>

      
      
    </entry>
  
    <entry>
      <title>4: Presto on ACID, row-level INSERT/DELETE, and why JDK11?</title>
      <link href="https://trino.io/episodes/4.html" rel="alternate" type="text/html" title="4: Presto on ACID, row-level INSERT/DELETE, and why JDK11?" />
      <published>2020-11-04T00:00:00+00:00</published>
      <updated>2020-11-04T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/4</id>
      <content type="html" xml:base="https://trino.io/episodes/4.html">&lt;p&gt;In this week’s concept, Manfred discusses ACID in general, CAP theorem, 
HDFS and Hive before ACID, and now ORC ACID and similar support.&lt;/p&gt;

&lt;p&gt;ACID &lt;a href=&quot;https://en.wikipedia.org/wiki/ACID&quot;&gt;https://en.wikipedia.org/wiki/ACID&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Atomicity - Transaction completely succeeds or completely fails, with no
partial results, so no inconsistent relationships are left dangling. The
database remains in a consistent state.&lt;/li&gt;
  &lt;li&gt;Consistency - database content always adheres to defined rules (key
 constraints).&lt;/li&gt;
  &lt;li&gt;Isolation - transactions are isolated from each other and can run in
  parallel with the same result as running sequentially.&lt;/li&gt;
  &lt;li&gt;Durability - no data is lost after transaction completion.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ACID used to be a crucial criterion for a “serious” relational database system.&lt;/p&gt;

&lt;p&gt;Then came big data and the CAP theorem. &lt;a href=&quot;https://en.wikipedia.org/wiki/CAP_theorem&quot;&gt;https://en.wikipedia.org/wiki/CAP_theorem&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Consistency&lt;/li&gt;
  &lt;li&gt;Availability&lt;/li&gt;
  &lt;li&gt;Partition tolerance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This week’s pull request, &lt;a href=&quot;https://github.com/trinodb/trino/pull/5402&quot;&gt;https://github.com/trinodb/trino/pull/5402&lt;/a&gt;,
came from contributor &lt;a href=&quot;https://github.com/djsstarburst&quot;&gt;David Stryker&lt;/a&gt;. David
covers some interesting aspects of working on this pull request. The commit
adds support for row-level insert and delete for Hive ACID tables, along with
product tests that verify that row-level insert and delete are allowed.&lt;/p&gt;

&lt;p&gt;Here is the SQL that we ran in the INSERT/DELETE demo&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;/*
  Ran against Presto
*/
SHOW SCHEMAS IN minio;
SHOW TABLES IN minio.acid;

CREATE SCHEMA minio.acid
WITH (location = &apos;s3a://acid/&apos;);


CREATE TABLE minio.acid.test (a int, b int)
WITH (
   format=&apos;ORC&apos;,
   transactional=true
);

INSERT INTO minio.acid.test VALUES (10, 10), (20, 20);

SELECT * FROM  minio.acid.test;

DELETE FROM minio.acid.test WHERE a = 10;

/*
  Ran against Hive
*/

SHOW DATABASES;

SELECT * FROM acid.test;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;David also mentioned &lt;a href=&quot;http://shzhangji.com/blog/2019/06/10/understanding-hive-acid-transactional-table/&quot;&gt;this blog&lt;/a&gt;
to better understand the Hive ACID model.&lt;/p&gt;

&lt;p&gt;In this week’s question we answer, “Why is Java 11 needed in the newer
versions of Presto, and how do I get an older version of Presto? I need 328,
the latest version on Java 8, since Java 11 isn’t available for me to use.”&lt;/p&gt;

&lt;p&gt;Presto uses Java 11 because it is the next LTS version of Java after 8. Java 11 
provides significant performance and stability improvements, so we believe 
everyone should be running that version to get the best experience out of 
Presto. Moving to Java 11 also allows us to take advantage of many improvements 
to the JDK and the Java language introduced since Java 8.&lt;/p&gt;

&lt;p&gt;Older versions can be downloaded from Maven Central, and older versions of the documentation remain available:
&lt;a href=&quot;https://repo.maven.apache.org/maven2/io/prestosql/presto-server/&quot;&gt;https://repo.maven.apache.org/maven2/io/prestosql/presto-server/&lt;/a&gt;
&lt;a href=&quot;https://trino.io/docs/328/&quot;&gt;https://trino.io/docs/328/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One thing to point out is that only the server requires JDK 11; the client
can run on JDK 8. One reason you might need to run Presto on JDK 8 is that the
server has to share a machine with another service running JDK 8. We don’t
recommend this, as it degrades the performance of your cluster and can cause
other issues if Presto is fighting for resources.&lt;/p&gt;

&lt;p&gt;Another possibility is that there is
a company policy requiring specific JDKs be installed on all servers. You can
have side-by-side installs of multiple versions of the JDK and use the
appropriate one. You just need to launch Presto with the correct java command. 
If your company is reluctant to adopt a newer JDK, you can use the arguments
above to get the policy updated to at least include JDK 11.&lt;/p&gt;
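&lt;p&gt;For example, a side-by-side setup only needs the launcher to pick up the
right JDK. The install path below is an example; adjust it to your system:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# point JAVA_HOME at the JDK 11 install (example path)
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk
export PATH=&quot;$JAVA_HOME/bin:$PATH&quot;
bin/launcher start
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;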

&lt;p&gt;Release Notes discussed:
&lt;a href=&quot;https://trino.io/docs/current/release/release-345.html&quot;&gt;https://trino.io/docs/current/release/release-345.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Manfred’s Training - SQL at any scale
&lt;a href=&quot;https://www.simpligility.com/2020/10/join-me-for-presto-first-steps/&quot;&gt;https://www.simpligility.com/2020/10/join-me-for-presto-first-steps/&lt;/a&gt;
&lt;a href=&quot;https://learning.oreilly.com/live-training/courses/presto-first-steps/0636920462859/&quot;&gt;https://learning.oreilly.com/live-training/courses/presto-first-steps/0636920462859/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-creating-a-single-point-of-access-to-multiple-postgres-servers-using-starburst-presto&quot;&gt;https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-creating-a-single-point-of-access-to-multiple-postgres-servers-using-starburst-presto&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-unlock-data-in-postgres-servers-to-query-it-with-other-data-sources-like-hive-kafka-other-dbmss-and-more&quot;&gt;https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-unlock-data-in-postgres-servers-to-query-it-with-other-data-sources-like-hive-kafka-other-dbmss-and-more&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://blog.bigdataboutique.com/2020/09/presto-meets-elasticsearch-our-elasticsearch-connector-for-presto-video-mbywtm &quot;&gt;https://blog.bigdataboutique.com/2020/09/presto-meets-elasticsearch-our-elasticsearch-connector-for-presto-video-mbywtm &lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Upcoming events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Nov 12 Webinar: &lt;a href=&quot;https://www.starburstdata.com/webinar-lower-cdw-costs-starburst&quot;&gt;https://www.starburstdata.com/webinar-lower-cdw-costs-starburst&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Nov 17 &lt;a href=&quot;https://databricks.com/session_eu20/presto-fast-sql-on-anything-including-delta-lake-snowflake-elasticsearch-and-more&quot;&gt;https://databricks.com/session_eu20/presto-fast-sql-on-anything-including-delta-lake-snowflake-elasticsearch-and-more&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Nov 19 &lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-detroit-mi/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-detroit-mi/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 2 &lt;a href=&quot;https://www.evanta.com/cdo/atlanta/2020-atlanta-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/atlanta/2020-atlanta-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 9 &lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-providence-ri/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-providence-ri/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 10 &lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-denver-co/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-denver-co/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 10 &lt;a href=&quot;https://www.evanta.com/cdo/san-francisco/2020-san-francisco-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/san-francisco/2020-san-francisco-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Dec 16 &lt;a href=&quot;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin (now with timestamps!):&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Presto Summit Series - Real world usage&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/05/15/state-of-presto.html&quot;&gt;https://trino.io/blog/2020/05/15/state-of-presto.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&quot;&gt;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&quot;&gt;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&quot;&gt;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Recent Podcasts:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.contributor.fyi/presto&quot;&gt;https://www.contributor.fyi/presto&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&quot;&gt;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to learn more about Presto yourself, you should check out the 
O’Reilly book Trino: The Definitive Guide. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>In this week’s concept, Manfred discusses ACID in general, CAP theorem, HDFS and Hive before ACID, and now ORC ACID and similar support.</summary>

      
      
    </entry>
  
    <entry>
      <title>3: Running two Presto distributions and Kafka headers as Presto columns</title>
      <link href="https://trino.io/episodes/3.html" rel="alternate" type="text/html" title="3: Running two Presto distributions and Kafka headers as Presto columns" />
      <published>2020-10-22T00:00:00+00:00</published>
      <updated>2020-10-22T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/3</id>
      <content type="html" xml:base="https://trino.io/episodes/3.html">&lt;p&gt;In this week’s concept, Manfred discusses what an SPI (service provider 
interface) is and covers the connector architecture, including Presto, 
Starburst, and custom connectors.&lt;/p&gt;

&lt;p&gt;This week’s pull request, &lt;a href=&quot;https://github.com/trinodb/trino/pull/4462&quot;&gt;https://github.com/trinodb/trino/pull/4462&lt;/a&gt;, 
came from user &lt;a href=&quot;https://github.com/0xE282B0&quot;&gt;Sven Pfennig&lt;/a&gt;. Sven works for 
&lt;a href=&quot;https://syncier.com&quot;&gt;Syncier GmbH&lt;/a&gt; and as part of his role there he gets to contribute
to open source projects such as Presto. Thanks, Sven! We jump into a quick setup
of a kafka broker using the 
&lt;a href=&quot;https://kafka.apache.org/quickstart&quot;&gt;kafka quickstart tutorial&lt;/a&gt; and I use the 
&lt;a href=&quot;https://github.com/edenhill/kafkacat&quot;&gt;kafkacat tool&lt;/a&gt; to show off the addition 
of headers in Kafka that Sven has provided us and discuss why this is 
beneficial.&lt;/p&gt;

&lt;p&gt;Here’s the crazy select statement I used to decode the binary values of the
foo column to UTF-8 text:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT 
   _message, 
   reduce(element_at(_headers,&apos;foo&apos;), &apos;&apos;, (s, c) -&amp;gt; s || from_utf8(c), s -&amp;gt; s) AS foo 
FROM kafka.default.pcb 
WHERE contains(map_keys(_headers), &apos;foo&apos;);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
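&lt;p&gt;If the foo header appears at most once per message, a simpler variant (a
sketch, not shown in the episode) can decode just the first value:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;SELECT 
   _message, 
   from_utf8(element_at(_headers, &apos;foo&apos;)[1]) AS foo 
FROM kafka.default.pcb 
WHERE contains(map_keys(_headers), &apos;foo&apos;);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;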

&lt;p&gt;An alternative tutorial that uses the TPC dataset is available on the website:
&lt;a href=&quot;https://trino.io/docs/current/connector/kafka-tutorial.html&quot;&gt;https://trino.io/docs/current/connector/kafka-tutorial.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This week’s question was accidentally cut off: I had mapped Shift + R to
toggle streaming/recording, which cut the broadcast when I typed the R in
FROM.&lt;/p&gt;

&lt;p&gt;Release Notes discussed:
&lt;a href=&quot;https://trino.io/docs/current/release/release-344.html&quot;&gt;https://trino.io/docs/current/release/release-344.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Manfred’s Training - SQL at any scale
&lt;a href=&quot;https://www.simpligility.com/2020/10/join-me-for-presto-first-steps/&quot;&gt;https://www.simpligility.com/2020/10/join-me-for-presto-first-steps/&lt;/a&gt;
&lt;a href=&quot;https://learning.oreilly.com/live-training/courses/presto-first-steps/0636920462859/&quot;&gt;https://learning.oreilly.com/live-training/courses/presto-first-steps/0636920462859/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-creating-a-single-point-of-access-to-multiple-postgres-servers-using-starburst-presto&quot;&gt;https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-creating-a-single-point-of-access-to-multiple-postgres-servers-using-starburst-presto&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-unlock-data-in-postgres-servers-to-query-it-with-other-data-sources-like-hive-kafka-other-dbmss-and-more&quot;&gt;https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-unlock-data-in-postgres-servers-to-query-it-with-other-data-sources-like-hive-kafka-other-dbmss-and-more&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://blog.bigdataboutique.com/2020/09/presto-meets-elasticsearch-our-elasticsearch-connector-for-presto-video-mbywtm&quot;&gt;https://blog.bigdataboutique.com/2020/09/presto-meets-elasticsearch-our-elasticsearch-connector-for-presto-video-mbywtm&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Upcoming events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/BigDataATL/events/273435961/&quot;&gt;https://www.meetup.com/BigDataATL/events/273435961/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://edw2020chicago.dataversity.net/&quot;&gt;https://edw2020chicago.dataversity.net/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-portland-or-2/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-portland-or-2/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-minneapolis-mn-2/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-minneapolis-mn-2/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-jacksonville-fl/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-jacksonville-fl/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.evanta.com/cdo/san-francisco/2020-san-francisco-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/san-francisco/2020-san-francisco-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-detroit-mi/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-detroit-mi/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.evanta.com/cdo/atlanta/2020-atlanta-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/atlanta/2020-atlanta-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-providence-ri/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-providence-ri/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-denver-co/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-denver-co/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin:
&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Presto Summit Series - Real world usage
&lt;a href=&quot;https://trino.io/blog/2020/05/15/state-of-presto.html&quot;&gt;https://trino.io/blog/2020/05/15/state-of-presto.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&quot;&gt;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&quot;&gt;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&quot;&gt;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Recent Podcasts:
&lt;a href=&quot;https://www.contributor.fyi/presto&quot;&gt;https://www.contributor.fyi/presto&lt;/a&gt;
&lt;a href=&quot;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&quot;&gt;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want to learn more about Presto yourself, you should check out the 
O’Reilly book Trino: The Definitive Guide. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>In this week’s concept, Manfred discusses what an SPI (service provider interface) is and covers the connector architecture of Presto, Starburst, and Custom.</summary>

      
      
    </entry>
  
    <entry>
      <title>2: Kubernetes, arrays on Elasticsearch, and security breaks the UI</title>
      <link href="https://trino.io/episodes/2.html" rel="alternate" type="text/html" title="2: Kubernetes, arrays on Elasticsearch, and security breaks the UI" />
      <published>2020-10-07T00:00:00+00:00</published>
      <updated>2020-10-07T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/2</id>
      <content type="html" xml:base="https://trino.io/episodes/2.html">&lt;p&gt;This week we had a bit of a technical issue between Zoom and OBS, so some
editing was done to remove a portion of the broadcast, mainly cutting out our
coverage of the releases. We circle back and give a small summary, but
unfortunately lost the majority of that part of the conversation.&lt;/p&gt;

&lt;p&gt;In this week’s concept, we cover a general overview of Kubernetes and how it
is used when deploying and scaling up Presto. We also dive into how this is
being used at our guest Cory Darby’s company, BlueCat.&lt;/p&gt;

&lt;p&gt;This week’s pull request is
&lt;a href=&quot;https://github.com/trinodb/trino/pull/2462&quot;&gt;https://github.com/trinodb/trino/pull/2462&lt;/a&gt;, which closes issue
&lt;a href=&quot;https://github.com/trinodb/trino/issues/2441&quot;&gt;https://github.com/trinodb/trino/issues/2441&lt;/a&gt;. This was actually a PR Brian
submitted some months ago. He dives a bit into 
&lt;a href=&quot;https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html&quot;&gt;Elasticsearch mappings&lt;/a&gt; 
and how Elasticsearch models its data. He then covers how this motivated the 
pull request, which addresses the need for explicit mappings declaring which 
Elasticsearch fields are array types and which are scalar.&lt;/p&gt;
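
&lt;p&gt;As a rough sketch of the feature: the connector reads array markers from the
index mapping’s _meta section. Assuming a hypothetical index with a field named
array_string_field, the mapping would look roughly like this (the key was
presto at the time of this PR; current Trino versions read the trino key):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;{
    &quot;_meta&quot;: {
        &quot;presto&quot;: {
            &quot;array_string_field&quot;: {
                &quot;isArray&quot;: true
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;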

&lt;p&gt;In this week’s question, we answer, “Why does the web UI say ‘disabled’?” This
typically comes from a security setup issue, and as a bonus we cover another
similar issue that occurs when you are using a proxy.&lt;/p&gt;
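
&lt;p&gt;As a hedged sketch (these property names come from the Trino documentation
and are not necessarily the exact fix discussed in the episode), the relevant
config.properties settings look like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Keep the web UI enabled (it is on by default)
web-ui.enabled=true
# Behind a proxy or load balancer, process X-Forwarded-* headers so
# authentication and redirects see the original client scheme and address
http-server.process-forwarded=true
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;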

&lt;p&gt;Release Notes discussed:
&lt;a href=&quot;https://trino.io/docs/current/release/release-342.html&quot;&gt;https://trino.io/docs/current/release/release-342.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/docs/current/release/release-343.html&quot;&gt;https://trino.io/docs/current/release/release-343.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Manfred’s Training - SQL at any scale
&lt;a href=&quot;https://www.simpligility.com/2020/10/join-me-for-presto-first-steps/&quot;&gt;https://www.simpligility.com/2020/10/join-me-for-presto-first-steps/&lt;/a&gt;
&lt;a href=&quot;https://learning.oreilly.com/live-training/courses/presto-first-steps/0636920462859/&quot;&gt;https://learning.oreilly.com/live-training/courses/presto-first-steps/0636920462859/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Blogs&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-creating-a-single-point-of-access-to-multiple-postgres-servers-using-starburst-presto&quot;&gt;https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-creating-a-single-point-of-access-to-multiple-postgres-servers-using-starburst-presto&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-unlock-data-in-postgres-servers-to-query-it-with-other-data-sources-like-hive-kafka-other-dbmss-and-more&quot;&gt;https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-unlock-data-in-postgres-servers-to-query-it-with-other-data-sources-like-hive-kafka-other-dbmss-and-more&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://medium.com/@joshua_robinson/presto-and-fast-object-putting-backups-to-use-for-devops-and-machine-learning-s3-46876eef4ffa&quot;&gt;https://medium.com/@joshua_robinson/presto-and-fast-object-putting-backups-to-use-for-devops-and-machine-learning-s3-46876eef4ffa&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Upcoming events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/BigDataATL/events/273435961/&quot;&gt;https://www.meetup.com/BigDataATL/events/273435961/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://edw2020chicago.dataversity.net/&quot;&gt;https://edw2020chicago.dataversity.net/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-portland-or-2/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-portland-or-2/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-minneapolis-mn-2/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-minneapolis-mn-2/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-jacksonville-fl/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-jacksonville-fl/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.evanta.com/cdo/san-francisco/2020-san-francisco-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/san-francisco/2020-san-francisco-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-detroit-mi/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-detroit-mi/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.evanta.com/cdo/atlanta/2020-atlanta-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/atlanta/2020-atlanta-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-providence-ri/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-providence-ri/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-denver-co/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-denver-co/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin:
&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Presto Summit Series - Real world usage
&lt;a href=&quot;https://trino.io/blog/2020/05/15/state-of-presto.html&quot;&gt;https://trino.io/blog/2020/05/15/state-of-presto.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&quot;&gt;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&quot;&gt;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&quot;&gt;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Recent Podcasts:
&lt;a href=&quot;https://www.contributor.fyi/presto&quot;&gt;https://www.contributor.fyi/presto&lt;/a&gt;
&lt;a href=&quot;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&quot;&gt;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want to learn more about Presto yourself, you should check out the 
O’Reilly book Trino: The Definitive Guide. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>This week we had a bit of a technical issue between zoom and OBS so there was some editing done to remove a portion of the broadcast which mainly cuts out us covering the releases. We circle back and give a small summary but unfortunately lost the majority of that part of the conversation.</summary>

      
      
    </entry>
  
    <entry>
      <title>1: What is Presto, WITH RECURSIVE, and Hive connector</title>
      <link href="https://trino.io/episodes/1.html" rel="alternate" type="text/html" title="1: What is Presto, WITH RECURSIVE, and Hive connector" />
      <published>2020-09-24T00:00:00+00:00</published>
      <updated>2020-09-24T00:00:00+00:00</updated>
      <id>https://trino.io/episodes/1</id>
      <content type="html" xml:base="https://trino.io/episodes/1.html">&lt;p&gt;Today’s concept covers a big overview of what Presto is for those that are new
to Presto. For more information about Presto, check out the following resources:
&lt;a href=&quot;/&quot;&gt;Website&lt;/a&gt;
&lt;a href=&quot;https://trino.io/docs/current/&quot;&gt;Documentation&lt;/a&gt;
Download the &lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;Free Presto O’Reilly Book&lt;/a&gt;
Learn &lt;a href=&quot;/development/&quot;&gt;how to contribute&lt;/a&gt;
Join our community on the &lt;a href=&quot;/slack.html&quot;&gt;Slack channel&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this PR we covered &lt;a href=&quot;https://github.com/trinodb/trino/pull/5163&quot;&gt;pull request 5163&lt;/a&gt;,
which is actually just a documentation update around the existing experimental
WITH RECURSIVE support. The extended development of
this feature is still being tracked and documented in 
&lt;a href=&quot;https://github.com/trinodb/trino/issues/1122&quot;&gt;issue 1122&lt;/a&gt;. As with many 
problems in recursion, the solution space typically grows exponentially, so
the feature can easily be misused and cause problems. We run the 
query and discuss it, as well as some of the things that can go wrong. Check out
the pull request to see the documentation that was added around it.&lt;/p&gt;
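
&lt;p&gt;For reference, a minimal recursive query of the kind covered in the
documentation looks like this; the common table expression expands to the
values 1 through 4, so the query should return 10:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;WITH RECURSIVE t(n) AS (
    VALUES (1)
    UNION ALL
    SELECT n + 1 FROM t WHERE n &amp;lt; 4
)
SELECT sum(n) FROM t;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Because each recursion step can multiply the number of rows in play, queries
like this need a termination condition that is guaranteed to be reached.&lt;/p&gt;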

&lt;p&gt;In the question of the week, we covered a lot of the confusion around the
&lt;a href=&quot;https://trino.io/docs/current/connector/hive.html&quot;&gt;Hive connector&lt;/a&gt;. Feel free to 
try out the Katacoda example I created, which will be nested within an 
&lt;a href=&quot;blog/2020/10/20/intro-to-hive-connector.html&quot;&gt;intro to the Hive connector blog&lt;/a&gt;.
This is running on a free Katacoda account, so resources are scarce at times
and it may take a while to load. Nevertheless, the accompanying information
will help you quickly get a Presto environment to play with.&lt;/p&gt;

&lt;p&gt;Release Notes discussed:
&lt;a href=&quot;https://trino.io/docs/current/release/release-341.html&quot;&gt;https://trino.io/docs/current/release/release-341.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Upcoming events&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.meetup.com/BigDataATL/events/273435961/&quot;&gt;https://www.meetup.com/BigDataATL/events/273435961/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://edw2020chicago.dataversity.net/&quot;&gt;https://edw2020chicago.dataversity.net/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-portland-or-2/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-portland-or-2/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-minneapolis-mn-2/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-minneapolis-mn-2/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-jacksonville-fl/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-jacksonville-fl/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.evanta.com/cdo/san-francisco/2020-san-francisco-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/san-francisco/2020-san-francisco-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-detroit-mi/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-detroit-mi/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.evanta.com/cdo/atlanta/2020-atlanta-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/atlanta/2020-atlanta-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-providence-ri/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-providence-ri/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techtalksummits.com/event/virtual-commercial-it-denver-co/&quot;&gt;https://techtalksummits.com/event/virtual-commercial-it-denver-co/&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&quot;&gt;https://www.evanta.com/cdo/boston/2020-boston-cdo-virtual-executive-summit&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Latest training from David, Dain, and Martin:
&lt;a href=&quot;https://trino.io/blog/2020/07/15/training-advanced-sql.html&quot;&gt;https://trino.io/blog/2020/07/15/training-advanced-sql.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/07/30/training-query-tuning.html&quot;&gt;https://trino.io/blog/2020/07/30/training-query-tuning.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/08/13/training-security.html&quot;&gt;https://trino.io/blog/2020/08/13/training-security.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/08/27/training-performance.html&quot;&gt;https://trino.io/blog/2020/08/27/training-performance.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Presto Summit Series - Real world usage
&lt;a href=&quot;https://trino.io/blog/2020/05/15/state-of-presto.html&quot;&gt;https://trino.io/blog/2020/05/15/state-of-presto.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&quot;&gt;https://trino.io/blog/2020/06/16/presto-summit-zuora.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&quot;&gt;https://trino.io/blog/2020/07/06/presto-summit-arm-td.html&lt;/a&gt;
&lt;a href=&quot;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&quot;&gt;https://trino.io/blog/2020/07/22/presto-summit-pinterest.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Recent Podcasts:
&lt;a href=&quot;https://www.contributor.fyi/presto&quot;&gt;https://www.contributor.fyi/presto&lt;/a&gt;
&lt;a href=&quot;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&quot;&gt;https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want to learn more about Presto yourself, you should check out the 
O’Reilly book Trino: The Definitive Guide. You can download 
&lt;a href=&quot;https://www.starburst.io/info/oreilly-trino-guide/&quot;&gt;the free PDF&lt;/a&gt; or 
buy the book online.&lt;/p&gt;

&lt;p&gt;Music for the show is from the &lt;a href=&quot;https://krzysztofslowikowski.bandcamp.com/album/mega-man-6-gp&quot;&gt;Megaman 6 Game Play album by Krzysztof 
Słowikowski&lt;/a&gt;.&lt;/p&gt;</content>

      

      <summary>Today’s concept covers a big overview of what Presto is for those that are new to Presto. For more information about Presto, check out the following resources: Website Documentation Download the Free Presto O’Reilly Book Learn how to contribute Join our community on the Slack channel</summary>

      
      
    </entry>
  
</feed>
