Cassandra

Navigating the NoSQL Landscape: MongoDB vs. Cassandra for nested or complex JSON data handling

The choice between MongoDB and Cassandra becomes crucial when dealing with nested or complex JSON objects. MongoDB and Cassandra offer different approaches due to their underlying data models and architectures.

MongoDB

Navigating the NoSQL Landscape: MongoDB vs. Cassandra for nested or complex JSON data handling

The choice between MongoDB and Cassandra becomes crucial when dealing with nested or complex JSON objects. MongoDB and Cassandra offer different approaches due to their underlying data models and architectures.

Nested-JSON

Navigating the NoSQL Landscape: MongoDB vs. Cassandra for nested or complex JSON data handling

The choice between MongoDB and Cassandra becomes crucial when dealing with nested or complex JSON objects. MongoDB and Cassandra offer different approaches due to their underlying data models and architectures.

NoSQL

Navigating the NoSQL Landscape: MongoDB vs. Cassandra for nested or complex JSON data handling

The choice between MongoDB and Cassandra becomes crucial when dealing with nested or complex JSON objects. MongoDB and Cassandra offer different approaches due to their underlying data models and architectures.

NoSQL-DB

Navigating the NoSQL Landscape: MongoDB vs. Cassandra for nested or complex JSON data handling

The choice between MongoDB and Cassandra becomes crucial when dealing with nested or complex JSON objects. MongoDB and Cassandra offer different approaches due to their underlying data models and architectures.

access-modifiers

airflow

How to set SLA in Apache Airflow

Apache Airflow enables us to schedule tasks as code. In Airflow, an SLA determines the maximum completion time for a task or DAG. Note that SLAs are established based on the DAG execution date, not the task start time.

algorithms

An Introduction to Algorithms and Data Structures

An algorithm is a series of instructions in a particular order for performing a specific task.

algorithms-and-data-structures

An Introduction to Algorithms and Data Structures

An algorithm is a series of instructions in a particular order for performing a specific task.

amazon-emr

Overview of Amazon EMR

Amazon EMR is a managed cluster platform that makes it easier to run big data frameworks like Apache Hadoop and Apache Spark on AWS to process and analyze huge amounts of data.

anti-pattern

Anti-Pattern

Anti-patterns seem quick and reasonable at first, but they typically have adverse effects later. They are design and code smells that harm our software and add technical debt. We should avoid them at all costs.

apache spark

apache-pinot

Apache Pinot joins hands with Kafka and Presto to provide low-latency, high-throughput user-facing analytics

Apache Pinot is a real-time, distributed OLAP datastore that was built for low-latency, high-throughput analytics, making it perfect for user-facing analytical workloads. Pinot joins hands with Kafka and Presto to provide user-facing analytics.

apache-spark

apache-yarn

apm

application-performance-monitoring

async

asynchronous

asynchronouse-programming

aws

AWS Command Line Interface (AWS CLI)

AWS CLI is an open-source tool that allows us to interact with AWS services using command-line shell commands.

Overview of Amazon EMR

Amazon EMR is a managed cluster platform that makes it easier to run big data frameworks like Apache Hadoop and Apache Spark on AWS to process and analyze huge amounts of data.

aws-cli

AWS Command Line Interface (AWS CLI)

AWS CLI is an open-source tool that allows us to interact with AWS services using command-line shell commands.

aws-glue

big-data

Data Governance

Data governance is the process of defining security guidelines and policies and making sure they are followed by having authority and control over how data assets are managed.

Data Catalog

A data catalog is an ordered inventory of an organization's data assets that makes it easy to find the most relevant data quickly.

bucketing

cache

coding-principles

Singleton Pattern

The singleton pattern ensures controlled access to a single instance of a class. While it offers significant benefits in terms of resource management and access control, developers must be mindful of its downsides, such as potential scalability issues and the introduction of global states. When used carefully, it can be...
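As a minimal sketch of the idea in the excerpt above, the Python class below always hands back the same instance. The class name `AppConfig` and its `settings` attribute are hypothetical and chosen only for illustration; the lock is one common way to keep instance creation safe across threads, not the only one.

```python
import threading


class AppConfig:
    """Hypothetical configuration holder that exists only once per process."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Double-checked locking: take the lock only if no instance exists yet.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    instance = super().__new__(cls)
                    instance.settings = {}  # shared (global) state lives here
                    cls._instance = instance
        return cls._instance


first = AppConfig()
second = AppConfig()
first.settings["env"] = "dev"
```

Both names refer to the same object, so a change made through `first` is visible through `second`. That shared state is exactly the global-state downside the excerpt warns about.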

coding-problem

coding-problem-solving

columnar-format

columnar-storage

container-management

container-orchestration

data

data-as-a-product

Data Product vs. Data as a Product

A data product is not the same as data as a product. A data product aids the accomplishment of the product's goal by using the data, whereas in data as a product, the data itself is seen as the actual product.

data-caching

data-catalog

Data Catalog

A data catalog is an ordered inventory of an organization's data assets that makes it easy to find the most relevant data quickly.

data-engineering

Data Product vs. Data as a Product

A data product is not the same as data as a product. A data product aids the accomplishment of the product's goal by using the data, whereas in data as a product, the data itself is seen as the actual product.

Data Governance

Data governance is the process of defining security guidelines and policies and making sure they are followed by having authority and control over how data assets are managed.

Data Catalog

A data catalog is an ordered inventory of an organization's data assets that makes it easy to find the most relevant data quickly.

data-goverance

Data Governance

Data governance is the process of defining security guidelines and policies and making sure they are followed by having authority and control over how data assets are managed.

data-inventory

Data Catalog

A data catalog is an ordered inventory of an organization's data assets that makes it easy to find the most relevant data quickly.

data-key

data-lake

Data Catalog

A data catalog is an ordered inventory of an organization's data assets that makes it easy to find the most relevant data quickly.

data-management

Data Product vs. Data as a Product

A data product is not the same as data as a product. A data product aids the accomplishment of the product's goal by using the data, whereas in data as a product, the data itself is seen as the actual product.

data-mesh

data-observability

data-pipeline

data-product

Data Product vs. Data as a Product

A data product is not the same as data as a product. A data product aids the accomplishment of the product's goal by using the data, whereas in data as a product, the data itself is seen as the actual product.

data-protection

data-quality

data-science

data-security

Data Governance

Data governance is the process of defining security guidelines and policies and making sure they are followed by having authority and control over how data assets are managed.

data-streaming

data-structures

An Introduction to Algorithms and Data Structures

An algorithm is a series of instructions in a particular order for performing a specific task.

database

database-indexing

delta lake

design-patterns

Singleton Pattern

The singleton pattern ensures controlled access to a single instance of a class. While it offers significant benefits in terms of resource management and access control, developers must be mindful of its downsides, such as potential scalability issues and the introduction of global states. When used carefully, it can be...

Anti-Pattern

Anti-patterns seem quick and reasonable at first, but they typically have adverse effects later. They are design and code smells that harm our software and add technical debt. We should avoid them at all costs.

distributed-system

elastic-apm

elasticsearch

envelope-encryption

etl

functional-programming

great-expectations

grpc

Introduction to gRPC

gRPC is an open-source, high-performance RPC framework that can run in any environment. gRPC builds on the HTTP/2 protocol and the protobuf message-encoding protocol to provide high-performance, low-bandwidth communication between applications and services.

gx

hadoop

iac

Terraform Basics

Terraform is an open-source infrastructure-as-code tool that allows us to programmatically provision the physical resources required for an application to run.

index

infrastructure-as-code

Terraform Basics

Terraform is an open-source infrastructure-as-code tool that allows us to programmatically provision the physical resources required for an application to run.

inter-process-communication

Introduction to gRPC

gRPC is an open-source, high-performance RPC framework that can run in any environment. gRPC builds on the HTTP/2 protocol and the protobuf message-encoding protocol to provide high-performance, low-bandwidth communication between applications and services.

inverted-index

kafka

Windowing in Kafka Streams

Windowing refers to the process of dividing a continuous stream of data into discrete segments, or windows, based on time. These windows then serve as the basis for applying computational operations, such as aggregations or transformations, to the data contained within them.
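To make the idea concrete, here is a small sketch in plain Python rather than the Kafka Streams API. It assigns each event to a fixed-size, non-overlapping (tumbling) window by its timestamp and counts events per key within each window. The function name `tumbling_window_counts` and the sample events are assumptions made for illustration.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_size_ms):
    """Count events per (window_start, key) using fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_ms, key) pairs.
    """
    counts = defaultdict(int)
    for timestamp_ms, key in events:
        # Round the timestamp down to the start of the window it falls in.
        window_start = (timestamp_ms // window_size_ms) * window_size_ms
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical sample stream: (timestamp in ms, key)
events = [(0, "a"), (400, "a"), (999, "b"), (1000, "a"), (2500, "b")]
result = tumbling_window_counts(events, window_size_ms=1000)
```

With a one-second window, the first three events land in the window starting at 0 ms, the fourth in the window starting at 1000 ms, and the last in the window starting at 2000 ms. The aggregation here is a simple count, but any per-window operation fits the same structure.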

kafka-streams

Windowing in Kafka Streams

Windowing refers to the process of dividing a continuous stream of data into discrete segments, or windows, based on time. These windows then serve as the basis for applying computational operations, such as aggregations or transformations, to the data contained within them.

kibana

knowledge-graph

kubernetes

lakefs

lakehouse

memory-management

Rust’s Ownership and Borrowing Enforce Memory Safety

Rust's ownership and borrowing features prevent us from experiencing memory-related problems. Rust is a great choice when performance matters and it solves pain points that bother many other languages.

nginx

object-oriented-programming

olap

Apache Pinot joins hands with Kafka and Presto to provide low-latency, high-throughput user-facing analytics

Apache Pinot is a real-time, distributed OLAP datastore that was built for low-latency, high-throughput analytics, making it perfect for user-facing analytical workloads. Pinot joins hands with Kafka and Presto to provide user-facing analytics.

olap-datastore

Apache Pinot joins hands with Kafka and Presto to provide low-latency, high-throughput user-facing analytics

Apache Pinot is a real-time, distributed OLAP datastore that was built for low-latency, high-throughput analytics, making it perfect for user-facing analytical workloads. Pinot joins hands with Kafka and Presto to provide user-facing analytics.

oops

parquet

partition

pinot

Apache Pinot joins hands with Kafka and Presto to provide low-latency, high-throughput user-facing analytics

Apache Pinot is a real-time, distributed OLAP datastore that was built for low-latency, high-throughput analytics, making it perfect for user-facing analytical workloads. Pinot joins hands with Kafka and Presto to provide user-facing analytics.

postgres

postgresql

presto

prestodb

problem-solving

programming

Rust’s Ownership and Borrowing Enforce Memory Safety

Rust's ownership and borrowing features prevent us from experiencing memory-related problems. Rust is a great choice when performance matters and it solves pain points that bother many other languages.

remote-procedure-call

Introduction to gRPC

gRPC is an open-source, high-performance RPC framework that can run in any environment. gRPC builds on the HTTP/2 protocol and the protobuf message-encoding protocol to provide high-performance, low-bandwidth communication between applications and services.

resource-management

reverse-etl

root-key

rpc

Introduction to gRPC

gRPC is an open-source, high-performance RPC framework that can run in any environment. gRPC builds on the HTTP/2 protocol and the protobuf message-encoding protocol to provide high-performance, low-bandwidth communication between applications and services.

rust

Rust’s Ownership and Borrowing Enforce Memory Safety

Rust's ownership and borrowing features prevent us from experiencing memory-related problems. Rust is a great choice when performance matters and it solves pain points that bother many other languages.

scala

scala-async

scala-collections

sdlc

What does 'yanked' release mean?

'Released' and 'yanked' are terms used in software development to indicate the state of a software package or library. These terms specify whether a given package version is suitable for use or needs to be avoided.

service-level-agreement

How to set SLA in Apache Airflow

Apache Airflow enables us to schedule tasks as code. In Airflow, an SLA determines the maximum completion time for a task or DAG. Note that SLAs are established based on the DAG execution date, not the task start time.

shuffling

singleton-pattern

Singleton Pattern

The singleton pattern ensures controlled access to a single instance of a class. While it offers significant benefits in terms of resource management and access control, developers must be mindful of its downsides, such as potential scalability issues and the introduction of global states. When used carefully, it can be...

sla

How to set SLA in Apache Airflow

Apache Airflow enables us to schedule tasks as code. In Airflow, an SLA determines the maximum completion time for a task or DAG. Note that SLAs are established based on the DAG execution date, not the task start time.

solid

sql

terraform

Terraform Basics

Terraform is an open-source infrastructure-as-code tool that allows us to programmatically provision the physical resources required for an application to run.

version control

What does 'yanked' release mean?

'Released' and 'yanked' are terms used in software development to indicate the state of a software package or library. These terms specify whether a given package version is suitable for use or needs to be avoided.

web-server

windowing

Windowing in Kafka Streams

Windowing refers to the process of dividing a continuous stream of data into discrete segments, or windows, based on time. These windows then serve as the basis for applying computational operations, such as aggregations or transformations, to the data contained within them.

workflow-engine

How to set SLA in Apache Airflow

Apache Airflow enables us to schedule tasks as code. In Airflow, an SLA determines the maximum completion time for a task or DAG. Note that SLAs are established based on the DAG execution date, not the task start time.

yanked

What does 'yanked' release mean?

'Released' and 'yanked' are terms used in software development to indicate the state of a software package or library. These terms specify whether a given package version is suitable for use or needs to be avoided.

yanked-release

What does 'yanked' release mean?

'Released' and 'yanked' are terms used in software development to indicate the state of a software package or library. These terms specify whether a given package version is suitable for use or needs to be avoided.