GitHub Trending - Weekly

Zigbee 🐝 to MQTT bridge 🌉, get rid of your proprietary Zigbee bridges 🔨




Zigbee2MQTT 🌉 🐝

Allows you to use your Zigbee devices without the vendor's bridge or gateway.

It bridges events and allows you to control your Zigbee devices via MQTT. In this way you can integrate your Zigbee devices with whatever smart home infrastructure you are using.
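As a hedged illustration of what that MQTT control looks like, the sketch below uses mosquitto_pub/mosquitto_sub against the default zigbee2mqtt/<friendly_name> topics; the broker address and the living_room_bulb device name are assumptions for the example.

# Turn a bulb on by publishing to its set topic (the friendly name is hypothetical)
mosquitto_pub -h localhost -t 'zigbee2mqtt/living_room_bulb/set' -m '{"state": "ON"}'

# Watch the state updates the device reports back
mosquitto_sub -h localhost -t 'zigbee2mqtt/living_room_bulb'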

Getting started

The documentation provides you all the information needed to get up and running! Make sure you don't skip sections if this is your first visit, as there might be important details in there for you.

If you aren't familiar with Zigbee terminology make sure you read this to help you out.

Integrations

Zigbee2MQTT integrates well with (almost) every home automation solution because it uses MQTT. However, the following integrations are worth mentioning:

Home Assistant


Homey


Domoticz


Gladys Assistant

  • Integration implemented natively in Gladys Assistant (documentation).

IoBroker


Architecture

Architecture

Internal Architecture

Zigbee2MQTT is made up of three modules, each developed in its own GitHub project. Starting from the hardware (adapter) and moving up: zigbee-herdsman connects to your Zigbee adapter and exposes an API to the higher levels of the stack. For Texas Instruments hardware, for example, zigbee-herdsman uses the TI zStack monitoring and test API to communicate with the adapter. zigbee-herdsman handles the core Zigbee communication. The zigbee-herdsman-converters module handles the mapping from individual device models to the Zigbee clusters they support; Zigbee clusters are the layers of the Zigbee protocol on top of the base protocol that define how lights, sensors and switches talk to each other over the Zigbee network. Finally, the Zigbee2MQTT module drives zigbee-herdsman and maps Zigbee messages to MQTT messages. Zigbee2MQTT also keeps track of the state of the system in a database.db file: a text file containing a JSON database of connected devices and their capabilities. Zigbee2MQTT provides a web-based interface that allows monitoring and configuration.

Developing

Zigbee2MQTT is (partially, for now) written in TypeScript. Therefore, after making changes to files in the lib/ directory you need to recompile Zigbee2MQTT by running npm run build. For faster development, you can instead run npm run build-watch in another terminal session; this recompiles as you change files.
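For reference, a typical edit-compile loop with the commands named above looks like this:

npm run build        # one-off compile after changing files in lib/
npm run build-watch  # or: recompile automatically on every change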

Supported devices

See Supported devices to check whether your device is supported. There is quite an extensive list, including devices from vendors like Xiaomi, Ikea, Philips, OSRAM and more.

If it's not listed in Supported devices, support can be added (fairly) easily, see How to support new devices.

Support & help

If you need assistance, you can check the open issues. Feel free to help with pull requests if you are able to fix things or add new devices, or just share the love on social media.


High performance self-hosted photo and video management solution.



License: AGPLv3 Discord

High performance self-hosted photo and video management solution



Català Español Français Italiano 日本語 한국어 Deutsch Nederlands Türkçe 中文 Русский Português Brasileiro Svenska العربية Tiếng Việt

Disclaimer

  • ⚠️ The project is under very active development.
  • ⚠️ Expect bugs and breaking changes.
  • ⚠️ Do not use the app as the only way to store your photos and videos.
  • ⚠️ Always follow the 3-2-1 backup plan for your precious photos and videos!

Note: You can find the main documentation, including installation guides, at https://immich.app/.

Links

Demo

Access the demo here. The demo is running on a Free-tier Oracle VM in Amsterdam with a 2.4 GHz quad-core ARM64 CPU and 24 GB RAM.

For the mobile app, you can use https://demo.immich.app/api as the Server Endpoint URL.
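As a quick sanity check before pointing the mobile app at a server, you can ping the API. The endpoint path below is an assumption based on older Immich releases, so verify it against the current API docs.

# Hypothetical sanity check: should return a small JSON "pong" response
curl https://demo.immich.app/api/server-info/ping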

Login credentials

Email: [email protected]
Password: demo

Features

Features | Mobile | Web
Upload and view videos and photos | Yes | Yes
Auto backup when the app is opened | Yes | N/A
Prevent duplication of assets | Yes | Yes
Selective album(s) for backup | Yes | N/A
Download photos and videos to local device | Yes | Yes
Multi-user support | Yes | Yes
Album and Shared albums | Yes | Yes
Scrubbable/draggable scrollbar | Yes | Yes
Support raw formats | Yes | Yes
Metadata view (EXIF, map) | Yes | Yes
Search by metadata, objects, faces, and CLIP | Yes | Yes
Administrative functions (user management) | No | Yes
Background backup | Yes | N/A
Virtual scroll | Yes | Yes
OAuth support | Yes | Yes
API Keys | N/A | Yes
LivePhoto/MotionPhoto backup and playback | Yes | Yes
Support 360 degree image display | No | Yes
User-defined storage structure | Yes | Yes
Public Sharing | Yes | Yes
Archive and Favorites | Yes | Yes
Global Map | Yes | Yes
Partner Sharing | Yes | Yes
Facial recognition and clustering | Yes | Yes
Memories (x years ago) | Yes | Yes
Offline support | Yes | No
Read-only gallery | Yes | Yes
Stacked Photos | Yes | Yes

Translations

Read more about translations here.

Translation status

Repository activity

Activities

Star history

Star History Chart

Contributors


Model components of the Llama Stack APIs


Llama Stack

PyPI version PyPI - Downloads Discord

This repository contains the Llama Stack API specifications as well as API Providers and Llama Stack Distributions.

The Llama Stack defines and standardizes the building blocks needed to bring generative AI applications to market. These blocks span the entire development lifecycle: from model training and fine-tuning, through product evaluation, to building and running AI agents in production. Beyond definition, we are building providers for the Llama Stack APIs: we are developing open-source versions and partnering with providers, ensuring developers can assemble AI solutions using consistent, interlocking pieces across platforms. The ultimate goal is to accelerate innovation in the AI space.

The Stack APIs are rapidly improving but are still very much a work in progress, and we invite feedback as well as direct contributions.

APIs

The Llama Stack consists of the following set of APIs:

  • Inference
  • Safety
  • Memory
  • Agentic System
  • Evaluation
  • Post Training
  • Synthetic Data Generation
  • Reward Scoring

Each of these APIs is itself a collection of REST endpoints.

API Providers

A Provider is what makes the API real -- they provide the actual implementation backing the API.

As an example, for Inference, we could have the implementation be backed by open source libraries like [ torch | vLLM | TensorRT ] as possible options.

A provider can also be just a pointer to a remote REST service -- for example, cloud providers or dedicated inference providers could serve these APIs.

Llama Stack Distribution

A Distribution is where APIs and Providers are assembled together to provide a consistent whole to the end application developer. You can mix-and-match providers -- some could be backed by local code and some could be remote. As a hobbyist, you can serve a small model locally but choose a cloud provider for a large model. Regardless, the higher-level APIs your app needs to work with don't need to change at all. You can even imagine moving across the server / mobile-device boundary as well, always using the same uniform set of APIs for developing generative AI applications.

Supported Llama Stack Implementations

API Providers

API Provider Builder Environments Agents Inference Memory Safety Telemetry
Meta Reference Single Node
Fireworks Hosted
AWS Bedrock Hosted
Together Hosted
Ollama Single Node
TGI Hosted and Single Node
Chroma Single Node
PG Vector Single Node
PyTorch ExecuTorch On-device iOS

Distributions

Distribution Provider Docker Inference Memory Safety Telemetry
Meta Reference Local GPU, Local CPU
Dell-TGI Local TGI + Chroma

Installation

You can install this repository as a package with pip install llama-stack

If you want to install from source:

mkdir -p ~/local
cd ~/local
git clone git@github.com:meta-llama/llama-stack.git

conda create -n stack python=3.10
conda activate stack

cd llama-stack
$CONDA_PREFIX/bin/pip install -e .

The Llama CLI

The llama CLI makes it easy to work with the Llama Stack set of tools, including installing and running Distributions, downloading models, studying model prompt formats, etc. Please see the CLI reference for details. Please see the Getting Started guide for running a Llama Stack server.
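As a hedged sketch of that workflow (subcommand names are assumptions and may differ between releases; the CLI reference is authoritative):

llama --help           # list available subcommands
llama model list       # see models known to the CLI (assumed subcommand)
llama stack build      # interactively assemble a distribution (assumed subcommand)
llama stack run <name> # start a Llama Stack server for that distribution (assumed; <name> is a placeholder)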

Llama Stack Client SDK

Check out our client SDKs for connecting to a Llama Stack server in your preferred language; you can choose from Python, Node, Swift, and Kotlin to quickly build your applications.


Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚


exo logo

exo: Run your own AI cluster at home with everyday devices. Maintained by exo labs.

Discord | Telegram | X

GitHub Repo stars Tests License: GPL v3


Forget expensive NVIDIA GPUs, unify your existing devices into one powerful GPU: iPhone, iPad, Android, Mac, Linux, pretty much any device!

Update: exo is hiring. See here for more details.

Get Involved

exo is experimental software. Expect bugs early on. Create issues so they can be fixed. The exo labs team will strive to resolve issues quickly.

We also welcome contributions from the community. We have a list of bounties in this sheet.

Features

Wide Model Support

exo supports different models including LLaMA (MLX and tinygrad), Mistral, LLaVA, Qwen, and Deepseek.

Dynamic Model Partitioning

exo optimally splits up models based on the current network topology and device resources available. This enables you to run larger models than you would be able to on any single device.

Automatic Device Discovery

exo will automatically discover other devices using the best method available. Zero manual configuration.

ChatGPT-compatible API

exo provides a ChatGPT-compatible API for running models. It's a one-line change in your application to run models on your own hardware using exo.

Device Equality

Unlike other distributed inference frameworks, exo does not use a master-worker architecture. Instead, exo devices connect p2p. As long as a device is connected somewhere in the network, it can be used to run models.

Exo supports different partitioning strategies to split up a model across devices. The default partitioning strategy is ring memory weighted partitioning. This runs an inference in a ring where each device runs a number of model layers proportional to the memory of the device.

ring topology

Installation

The current recommended way to install exo is from source.

Prerequisites

Hardware Requirements

  • The only requirement to run exo is to have enough memory across all your devices to fit the entire model into memory. For example, if you are running llama 3.1 8B (fp16), you need 16GB of memory across all devices. Any of the following configurations would work since they each have more than 16GB of memory in total:
    • 2 x 8GB M3 MacBook Airs
    • 1 x 16GB NVIDIA RTX 4070 Ti Laptop
    • 2 x Raspberry Pi 400 with 4GB of RAM each (running on CPU) + 1 x 8GB Mac Mini
  • exo is designed to run on devices with heterogeneous capabilities. For example, you can have some devices with powerful GPUs and others with integrated GPUs or even CPUs. Adding less capable devices will slow down individual inference latency but will increase the overall throughput of the cluster.

From source

git clone https://github.com/exo-explore/exo.git
cd exo
pip install -e .
# alternatively, with venv
source install.sh

Troubleshooting

  • If running on Mac, MLX has an install guide with troubleshooting steps.

Performance

  • There are a number of things users have empirically found to improve performance on Apple Silicon Macs:
  1. Upgrade to the latest version of MacOS 15.
  2. Run ./configure_mlx.sh. This runs commands to optimize GPU memory allocation on Apple Silicon Macs.

Documentation

Example Usage on Multiple MacOS Devices

Device 1:

exo

Device 2:

exo

That's it! No configuration required - exo will automatically discover the other device(s).

exo starts a ChatGPT-like WebUI (powered by tinygrad tinychat) on http://localhost:8000

For developers, exo also starts a ChatGPT-compatible API endpoint on http://localhost:8000/v1/chat/completions. Examples with curl:

Llama 3.2 3B:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "llama-3.2-3b",
     "messages": [{"role": "user", "content": "What is the meaning of exo?"}],
     "temperature": 0.7
   }'

Llama 3.1 405B:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "llama-3.1-405b",
     "messages": [{"role": "user", "content": "What is the meaning of exo?"}],
     "temperature": 0.7
   }'

Llava 1.5 7B (Vision Language Model):

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
     "model": "llava-1.5-7b-hf",
     "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What are these?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "http://images.cocodataset.org/val2017/000000039769.jpg"
            }
          }
        ]
      }
    ],
     "temperature": 0.0
   }'

Example Usage on Multiple Heterogenous Devices (MacOS + Linux)

Device 1 (MacOS):

exo --inference-engine tinygrad

Here we explicitly tell exo to use the tinygrad inference engine.

Device 2 (Linux):

exo

Linux devices will automatically default to using the tinygrad inference engine.

You can read about tinygrad-specific env vars here. For example, you can configure tinygrad to use the cpu by specifying CLANG=1.

Example Usage on a single device with "exo run" command

exo run llama-3.2-3b

With a custom prompt:

exo run llama-3.2-3b --prompt "What is the meaning of exo?"

Debugging

Enable debug logs with the DEBUG environment variable (0-9).

DEBUG=9 exo

For the tinygrad inference engine specifically, there is a separate DEBUG flag TINYGRAD_DEBUG that can be used to enable debug logs (1-6).

TINYGRAD_DEBUG=2 exo

Known Issues

  • On some versions of MacOS/Python, certificates are not installed properly which can lead to SSL errors (e.g. SSL error with huggingface.co). To fix this, run the Install Certificates command, usually:
/Applications/Python 3.x/Install Certificates.command
  • 🚧 As the library is evolving so quickly, the iOS implementation has fallen behind Python. We have decided for now not to put out the buggy iOS version and receive a bunch of GitHub issues for outdated code. We are working on solving this properly and will make an announcement when it's ready. If you would like access to the iOS implementation now, please email [email protected] with your GitHub username explaining your use-case and you will be granted access on GitHub.

Inference Engines

exo currently supports the MLX and tinygrad inference engines (as used in the examples above).

Networking Modules


Godot Engine – Multi-platform 2D and 3D game engine


Godot Engine

Godot Engine logo

2D and 3D cross-platform game engine

Godot Engine is a feature-packed, cross-platform game engine to create 2D and 3D games from a unified interface. It provides a comprehensive set of common tools, so that users can focus on making games without having to reinvent the wheel. Games can be exported with one click to a number of platforms, including the major desktop platforms (Linux, macOS, Windows), mobile platforms (Android, iOS), as well as Web-based platforms and consoles.

Free, open source and community-driven

Godot is completely free and open source under the very permissive MIT license. No strings attached, no royalties, nothing. The users' games are theirs, down to the last line of engine code. Godot's development is fully independent and community-driven, empowering users to help shape their engine to match their expectations. It is supported by the Godot Foundation not-for-profit.

Before being open sourced in February 2014, Godot had been developed by Juan Linietsky and Ariel Manzur (both still maintaining the project) for several years as an in-house engine, used to publish several work-for-hire titles.

Screenshot of a 3D scene in the Godot Engine editor

Getting the engine

Binary downloads

Official binaries for the Godot editor and the export templates can be found on the Godot website.

Compiling from source

See the official docs for compilation instructions for every supported platform.
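As a rough sketch of what a typical editor build looks like on Linux (assuming SCons is installed; the exact platform and target names for your version are in the official docs):

git clone https://github.com/godotengine/godot.git
cd godot
scons platform=linuxbsd target=editor -j$(nproc)   # platform/target names assumed from recent 4.x docs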

Community and contributing

Godot is not only an engine but an ever-growing community of users and engine developers. The main community channels are listed on the homepage.

The best way to get in touch with the core engine developers is to join the Godot Contributors Chat.

To get started contributing to the project, see the contributing guide. This document also includes guidelines for reporting bugs.

Documentation and demos

The official documentation is hosted on Read the Docs. It is maintained by the Godot community in its own GitHub repository.

The class reference is also accessible from the Godot editor.

We also maintain official demos in their own GitHub repository as well as a list of awesome Godot community resources.

There are also a number of other learning resources provided by the community, such as text and video tutorials, demos, etc. Consult the community channels for more information.



⚡ Open-source workflow automation platform. Orchestrate any language using YAML, hundreds of integrations. Alternative to Airflow, n8n, RunDeck, Camunda, Jenkins...


Kestra workflow orchestrator

Event-Driven Declarative Orchestration Platform

Kestra: infinitely scalable orchestration and scheduling platform

Get started in 4 minutes with Kestra

Click on the image to learn how to get started with Kestra in 4 minutes.

🌟 What is Kestra?

Kestra is an open-source, event-driven orchestration platform that makes both scheduled and event-driven workflows easy. By bringing Infrastructure as Code best practices to data, process, and microservice orchestration, you can build reliable workflows directly from the UI in just a few lines of YAML.

Key Features:

  • Everything as Code and from the UI: keep workflows as code with a Git Version Control integration, even when building them from the UI.
  • Event-Driven & Scheduled Workflows: automate both scheduled and real-time event-driven workflows via a simple trigger definition.
  • Declarative YAML Interface: define workflows using a simple configuration in the built-in code editor.
  • Rich Plugin Ecosystem: hundreds of plugins built in to extract data from any database, cloud storage, or API, and run scripts in any language.
  • Intuitive UI & Code Editor: build and visualize workflows directly from the UI with syntax highlighting, auto-completion and real-time syntax validation.
  • Scalable: designed to handle millions of workflows, with high availability and fault tolerance.
  • Version Control Friendly: write your workflows from the built-in code Editor and push them to your preferred Git branch directly from Kestra, enabling best practices with CI/CD pipelines and version control systems.
  • Structure & Resilience: tame chaos and bring resilience to your workflows with namespaces, labels, subflows, retries, timeout, error handling, inputs, outputs that generate artifacts in the UI, variables, conditional branching, advanced scheduling, event triggers, backfills, dynamic tasks, sequential and parallel tasks, and skip tasks or triggers when needed by setting the flag disabled to true.

🧑‍💻 The YAML definition gets automatically adjusted any time you make changes to a workflow from the UI or via an API call. Therefore, the orchestration logic is always managed declaratively in code, even if you modify your workflows in other ways (UI, CI/CD, Terraform, API calls).

Adding new tasks in the UI


🚀 Quick Start

Try the Live Demo

Try Kestra with our Live Demo. No installation required!

Get Started Locally in 5 Minutes

Launch Kestra in Docker

Make sure that Docker is running. Then, start Kestra in a single command:

docker run --pull=always --rm -it -p 8080:8080 --user=root \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /tmp:/tmp kestra/kestra:latest server local

Check our Installation Guide for other deployment options (Docker Compose, Podman, Kubernetes, AWS, GCP, Azure, and more).

Access the Kestra UI at http://localhost:8080 and start building your first flow!

Your First Hello World Flow

Create a new flow with the following content:

id: hello_world
namespace: dev

tasks:
  - id: say_hello
    type: io.kestra.plugin.core.log.Log
    message: "Hello, World!"

Run the flow and see the output in the UI!
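Besides the UI, you can also trigger an execution over the API. The route below is a hedged sketch based on Kestra's execution API and may differ by version, so check the API reference.

# Hypothetical example: create an execution of the hello_world flow in the dev namespace
curl -X POST http://localhost:8080/api/v1/executions/dev/hello_world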


🧩 Plugin Ecosystem

Kestra's functionality is extended through a rich ecosystem of plugins that empower you to run tasks anywhere and code in any language, including Python, Node.js, R, Go, Shell, and more. Here's how Kestra plugins enhance your workflows:

  • Run Anywhere:

    • Local or Remote Execution: Execute tasks on your local machine, remote servers via SSH, or scale out to serverless containers using Task Runners.
    • Docker and Kubernetes Support: Seamlessly run Docker containers within your workflows or launch Kubernetes jobs to handle compute-intensive workloads.
  • Code in Any Language:

    • Scripting Support: Write scripts in your preferred programming language. Kestra supports Python, Node.js, R, Go, Shell, and others, allowing you to integrate existing codebases and deployment patterns.
    • Flexible Automation: Execute shell commands, run SQL queries against various databases, and make HTTP requests to interact with APIs.
  • Event-Driven and Real-Time Processing:

    • Real-Time Triggers: React to events from external systems in real-time, such as file arrivals, new messages in message buses (Kafka, Redis, Pulsar, AMQP, MQTT, NATS, AWS SQS, Google Pub/Sub, Azure Event Hubs), and more.
    • Custom Events: Define custom events to trigger flows based on specific conditions or external signals, enabling highly responsive workflows.
  • Cloud Integrations:

    • AWS, Google Cloud, Azure: Integrate with a variety of cloud services to interact with storage solutions, messaging systems, compute resources, and more.
    • Big Data Processing: Run big data processing tasks using tools like Apache Spark or interact with analytics platforms like Google BigQuery.
  • Monitoring and Notifications:

    • Stay Informed: Send messages to Slack channels, email notifications, or trigger alerts in PagerDuty to keep your team updated on workflow statuses.

Kestra's plugin ecosystem is continually expanding, allowing you to tailor the platform to your specific needs. Whether you're orchestrating complex data pipelines, automating scripts across multiple environments, or integrating with cloud services, there's likely a plugin to assist. And if not, you can always build your own plugins to extend Kestra's capabilities.

🧑‍💻 Note: This is just a glimpse of what Kestra plugins can do. Explore the full list on our Plugins Page.


📚 Key Concepts

  • Flows: the core unit in Kestra, representing a workflow composed of tasks.
  • Tasks: individual units of work, such as running a script, moving data, or calling an API.
  • Namespaces: logical grouping of flows for organization and isolation.
  • Triggers: schedule or events that initiate the execution of flows.
  • Inputs & Variables: parameters and dynamic data passed into flows and tasks.

🎨 Build Workflows Visually

Kestra provides an intuitive UI that allows you to interactively build and visualize your workflows:

  • Drag-and-Drop Interface: add and rearrange tasks from the Topology Editor.
  • Real-Time Validation: instant feedback on your workflow's syntax and structure to catch errors early.
  • Auto-Completion: smart suggestions as you type.
  • Live Topology View: see your workflow as a Directed Acyclic Graph (DAG) that updates in real-time.

🔧 Extensible and Developer-Friendly

Plugin Development

Create custom plugins to extend Kestra's capabilities. Check out our Plugin Developer Guide to get started.

Infrastructure as Code

  • Version Control: store your flows in Git repositories.
  • CI/CD Integration: automate deployment of flows using CI/CD pipelines.
  • Terraform Provider: manage Kestra resources with the official Terraform provider.

🌐 Join the Community

Stay connected and get support:

  • Slack: Join our Slack community to ask questions and share ideas.
  • Twitter: Follow us on Twitter for the latest updates.
  • YouTube: Subscribe to our YouTube channel for tutorials and webinars.
  • LinkedIn: Connect with us on LinkedIn.

🤝 Contributing

We welcome contributions of all kinds!


📄 License

Kestra is licensed under the Apache 2.0 License © Kestra Technologies.


⭐️ Stay Updated

Give our repository a star to stay informed about the latest features and updates!

Star the Repo


Thank you for considering Kestra for your workflow orchestration needs. We can't wait to see what you'll build!


Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are committed to automating these high-value, generic R&D processes through our open-source R&D automation tool RD-Agent, which lets AI drive data-driven AI.


RD-Agent logo

🖥️ Live Demo | 🎥 Demo Video ▶️YouTube | 📖 Documentation | 📃 Papers


📰 News

🗞️ News 📝 Description
Official WeChat group release We created a WeChat group, welcome to join! (🗪QR Code)
Official Discord release We launch our first chatting channel in Discord (🗪Chat)
First release RDAgent is released on GitHub

🌟 Introduction

Our focused scenario

RDAgent aims to automate the most critical and valuable aspects of the industrial R&D process, and we begin with focusing on the data-driven scenarios to streamline the development of models and data. Methodologically, we have identified a framework with two key components: 'R' for proposing new ideas and 'D' for implementing them. We believe that the automatic evolution of R&D will lead to solutions of significant industrial value.

R&D is a very general scenario, and RDAgent can serve as your automated R&D assistant across many of them.

You can click the links above to view the demos. We're continuously adding more methods and scenarios to the project to enhance your R&D processes and boost productivity.

Additionally, you can take a closer look at the examples in our 🖥️ Live Demo.

⚡ Quick start

You can try the demos above by running the following commands:

🐳 Docker installation.

Users must ensure Docker is installed before attempting most scenarios. Please refer to the official 🐳Docker page for installation instructions.

🐍 Create a Conda Environment

  • Create a new conda environment with Python (3.10 and 3.11 are well-tested in our CI):
    conda create -n rdagent python=3.10
    
  • Activate the environment:
    conda activate rdagent
    

🛠️ Install the RDAgent

  • You can directly install the RDAgent package from PyPI:
    pip install rdagent
    

⚙️ Configuration

  • You have to configure your GPT model in the .env file:
    cat << EOF  > .env
    OPENAI_API_KEY=<your_api_key>
    # EMBEDDING_MODEL=text-embedding-3-small
    CHAT_MODEL=gpt-4-turbo
    EOF
    

🚀 Run the Application

The 🖥️ Live Demo is implemented by the following commands (each item represents one demo; you can select the one you prefer):

  • Run the Automated Quantitative Trading & Iterative Factors Evolution: Qlib self-loop factor proposal and implementation application

    rdagent fin_factor
    
  • Run the Automated Quantitative Trading & Iterative Model Evolution: Qlib self-loop model proposal and implementation application

    rdagent fin_model
    
  • Run the Automated Medical Prediction Model Evolution: Medical self-loop model proposal and implementation application

    (1) Apply for an account at PhysioNet.
    (2) Request access to FIDDLE preprocessed data: FIDDLE Dataset.
    (3) Place your username and password in .env.

    cat << EOF  >> .env
    DM_USERNAME=<your_username>
    DM_PASSWORD=<your_password>
    EOF
    
    rdagent med_model
    
  • Run the Automated Quantitative Trading & Factors Extraction from Financial Reports: Run the Qlib factor extraction and implementation application based on financial reports

    # 1. Generally, you can run this scenario using the following command:
    rdagent fin_factor_report --report_folder=<Your financial reports folder path>
    
    # 2. Specifically, you need to prepare some financial reports first. You can follow this concrete example:
    wget https://github.com/SunsetWolf/rdagent_resource/releases/download/reports/all_reports.zip
    unzip all_reports.zip -d git_ignore_folder/reports
    rdagent fin_factor_report --report_folder=git_ignore_folder/reports
    
  • Run the Automated Model Research & Development Copilot: model extraction and implementation application

    # 1. Generally, you can run your own papers/reports with the following command:
    rdagent general_model <Your paper URL>
    
    # 2. Specifically, you can do it like this. For more details and additional paper examples, use `rdagent general_model -h`:
    rdagent general_model  "https://arxiv.org/pdf/2210.09789"
    

🖥️ Monitor the Application Results

  • You can serve our demo app to monitor the RD loop by running the following command:
    rdagent ui --port 80 --log_dir <your log folder like "log/">
    

🏭 Scenarios

We have applied RD-Agent to multiple valuable data-driven industrial scenarios.

🎯 Goal: Agent for Data-driven R&D

In this project, we are aiming to build an Agent to automate Data-Driven R&D that can

  • 📄 Read real-world material (reports, papers, etc.) and extract key formulas and descriptions of the features and models of interest, which are the key components of data-driven R&D.
  • 🛠️ Implement the extracted formulas (e.g., features, factors, and models) as runnable code.
    • Because LLMs have limited ability to implement everything at once, we build an evolving process so the agent can improve performance by learning from feedback and knowledge.
  • 💡 Propose new ideas based on current knowledge and observations.

📈 Scenarios/Demos

In the two key areas of data-driven scenarios, model implementation and data building, our system aims to serve two main roles: 🦾Copilot and 🤖Agent.

  • The 🦾Copilot follows human instructions to automate repetitive tasks.
  • The 🤖Agent, being more autonomous, actively proposes ideas for better results in the future.

The supported scenarios are listed below:

Scenario/Target Model Implementation Data Building
💹 Finance 🤖 Iteratively Proposing Ideas & Evolving▶️YouTube 🤖 Iteratively Proposing Ideas & Evolving ▶️YouTube
🦾 Auto reports reading & implementation▶️YouTube
🩺 Medical 🤖 Iteratively Proposing Ideas & Evolving▶️YouTube -
🏭 General 🦾 Auto paper reading & implementation▶️YouTube -

Different scenarios vary in their entry points and configuration. Please check the detailed setup tutorial in the scenario documents.

Here is a gallery of successful explorations (5 traces shown in the 🖥️ Live Demo). You can download and view the execution trace using the command below:

rdagent ui --port 80 --log_dir ./demo_traces

Please refer to 📖readthedocs_scen for more details of the scenarios.

⚙️ Framework

Framework-RDAgent

Automating the R&D process in data science is a highly valuable yet underexplored area in industry. We propose a framework to push the boundaries of this important research field.

The research questions within this framework can be divided into three main categories:

Research Area Paper/Work List
Benchmark the R&D abilities Benchmark
Idea proposal: Explore new ideas or refine existing ones Research
Ability to realize ideas: Implement and execute ideas Development

We believe that the key to delivering high-quality solutions lies in the ability to evolve R&D capabilities. Agents should learn like human experts, continuously improving their R&D skills.

More documents can be found in the 📖 readthedocs.

📃 Paper/Work list

📊 Benchmark

@misc{chen2024datacentric,
    title={Towards Data-Centric Automatic R&D},
    author={Haotian Chen and Xinjie Shen and Zeqi Ye and Wenjun Feng and Haoxue Wang and Xiao Yang and Xu Yang and Weiqing Liu and Jiang Bian},
    year={2024},
    eprint={2404.11276},
    archivePrefix={arXiv},
    primaryClass={cs.AI}
}

image

🔍 Research

In a data mining expert's daily research and development process, they propose a hypothesis (e.g., a model structure like RNN can capture patterns in time-series data), design experiments (e.g., finance data contains time-series and we can verify the hypothesis in this scenario), implement the experiment as code (e.g., Pytorch model structure), and then execute the code to get feedback (e.g., metrics, loss curve, etc.). The experts learn from the feedback and improve in the next iteration.

Based on the principles above, we have established a basic method framework that continuously proposes hypotheses, verifies them, and gets feedback from the real-world practice. This is the first scientific research automation framework that supports linking with real-world verification.

For more detail, please refer to our 🖥️ Live Demo page.

🛠️ Development

@misc{yang2024collaborative,
    title={Collaborative Evolving Strategy for Automatic Data-Centric Development},
    author={Xu Yang and Haotian Chen and Wenjun Feng and Haoxue Wang and Zeqi Ye and Xinjie Shen and Xiao Yang and Shizhao Sun and Weiqing Liu and Jiang Bian},
    year={2024},
    eprint={2407.18690},
    archivePrefix={arXiv},
    primaryClass={cs.AI}
}

image

🤝 Contributing

📝 Guidelines

This project welcomes contributions and suggestions. Contributing to this project is straightforward and rewarding. Whether it's solving an issue, addressing a bug, enhancing documentation, or even correcting a typo, every contribution is valuable and helps improve RDAgent.

To get started, you can explore the issues list, or search for TODO: comments in the codebase by running the command grep -r "TODO:".

Before we released RD-Agent as an open-source project on GitHub, it was an internal project within our group. Unfortunately, the internal commit history was not preserved when we removed some confidential code. As a result, some contributions from our group members, including Haotian Chen, Wenjun Feng, Haoxue Wang, Zeqi Ye, Xinjie Shen, and Jinhui Li, were not included in the public commits.

⚖️ Legal disclaimer

The RD-agent is provided “as is”, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. The RD-agent is aimed to facilitate research and development process in the financial industry and not ready-to-use for any financial investment or advice. Users shall independently assess and test the risks of the RD-agent in a specific use scenario, ensure the responsible use of AI technology, including but not limited to developing and integrating risk mitigation measures, and comply with all applicable laws and regulations in all applicable jurisdictions. The RD-agent does not provide financial opinions or reflect the opinions of Microsoft, nor is it designed to replace the role of qualified financial professionals in formulating, assessing, and approving finance products. The inputs and outputs of the RD-agent belong to the users and users shall assume all liability under any theory of liability, whether in contract, torts, regulatory, negligence, products liability, or otherwise, associated with use of the RD-agent and any inputs and outputs thereof.


DiceDB is a Redis-compliant, in-memory, real-time, and reactive database optimized for modern hardware and for building and scaling truly real-time applications.


DiceDB


DiceDB is an in-memory, real-time, and reactive database with Redis and SQL support optimized for modern hardware and building real-time applications.

We are looking for Early Design Partners, so if you want to evaluate DiceDB, block our calendar. We are always up for a chat.

Note: DiceDB is still in development and it supports a subset of Redis commands. So, please do not use it in production. But, feel free to go through the open issues and contribute to help us speed up the development.

Want to contribute?

We have multiple repositories where you can contribute. So, as per your interest, you can pick one and build a deeper understanding of the project on the go.

How is it different from Redis?

Although DiceDB is a drop-in replacement of Redis, which means almost no learning curve and switching does not require any code change, it still differs in two key aspects:

  1. DiceDB is multithreaded and follows shared-nothing architecture.
  2. DiceDB supports a new command called QWATCH that lets clients listen to a SQL query and get notified in real-time whenever something changes.

With this, you can build truly real-time applications, like a leaderboard, with simple SQL queries.
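For instance, a leaderboard watcher could look roughly like the sketch below; the DSQL syntax and key naming are illustrative assumptions, so check the QWATCH documentation for the exact grammar.

# From the DiceDB CLI (or any client that supports QWATCH) -- illustrative only
QWATCH "SELECT $key, $value WHERE $key LIKE 'player:*' ORDER BY $value DESC LIMIT 3"
# The client is then notified in real time whenever a matching key changes.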

Leaderboard with DiceDB

Get started

Using Docker

The easiest way to get started with DiceDB is using Docker by running the following command.

$ docker run -p 7379:7379 dicedb/dicedb

The above command will start the DiceDB server running locally on the port 7379 and you can connect to it using DiceDB CLI and SDKs, or even Redis CLIs and SDKs.

Note: Given it is a drop-in replacement of Redis, you can also use any Redis CLI and SDK to connect to DiceDB.
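For example, assuming you have the standard Redis CLI installed, you can talk to the container started above like this:

redis-cli -p 7379 SET welcome "hello, world"
redis-cli -p 7379 GET welcome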

Multi-Threading Mode (Experimental)

Multi-threading is currently under active development. To run the server with multi-threading enabled, follow these steps:

$ git clone https://github.com/dicedb/dice
$ cd dice
$ go run main.go --enable-multithreading=true

Note: Only the following commands are optimized for multithreaded execution: PING, AUTH, SET, GET, GETSET, ABORT

Setting up DiceDB from source for development and contributions

To run DiceDB for local development or running from source, you will need

  1. Golang
  2. Any of the below supported platform environments:
    1. Linux based environment
    2. OSX (Darwin) based environment
    3. WSL under Windows
$ git clone https://github.com/dicedb/dice
$ cd dice
$ go run main.go
  3. Install GoLangCI
$ sudo su
$ curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b /bin v1.60.1

Live Development Server

DiceDB provides a hot-reloading development environment, which allows you to instantly view your code changes in a live server. This functionality is supported by Air

To Install Air on your system you have the following options.

  1. If you're on go 1.22+
go install github.com/air-verse/air@latest
  2. Install the Air binary
# binary will be installed at $(go env GOPATH)/bin/air
curl -sSfL https://raw.githubusercontent.com/air-verse/air/master/install.sh | sh -s -- -b $(go env GOPATH)/bin

Once air is installed you can verify the installation using the command air -v

To run the live DiceDB server for local development:

$ git clone https://github.com/dicedb/dice
$ cd dice
$ DICE_ENV=dev air

The DICE_ENV environment variable is used to set the environment; by default it is treated as production. Setting it to dev gives you pretty-printed logs and a lower log level.

Local Setup with Custom Config

By default, DiceDB will look for the configuration file at /etc/dice/config.toml. (Linux, Darwin, and WSL)

$ # set up configuration file # (optional but recommended)
$ sudo mkdir -p /etc/dice
$ sudo chown root:$USER /etc/dice
$ sudo chmod 775 /etc/dice # or 777 if you are the only user
$ git clone https://github.com/DiceDB/dice.git
$ cd dice
$ go run main.go -init-config

For Windows Users:

If you're using Windows, it is recommended to use Windows Subsystem for Linux (WSL) or WSL 2 to run the above commands seamlessly in a Linux-like environment.

Alternatively, you can:

Create a directory at C:\ProgramData\dice and run the following command to generate the configuration file:

go run main.go -init-config

For a smoother experience, we highly recommend using WSL.

Additional Configuration Options:

If you'd like to use a different location, you can specify a custom configuration file path with the -c flag:

go run main.go -c /path/to/config.toml

If you'd like to output the configuration file to a specific location, you can specify a custom output path with the -o flag:

go run main.go -o /path/of/output/dir

Setting up CLI

The best way to connect to DiceDB is using DiceDB CLI and you can install it by running the following command.

$ pip install dicedb-cli

Because DiceDB speaks Redis dialect, you can connect to it with any Redis Client and SDK also. But if you are planning to use the QWATCH feature then you need to use the DiceDB CLI.

Running Tests

Unit tests and integration tests are essential for ensuring correctness and in the case of DiceDB, both types of tests are available to validate its functionality.

For unit testing, you can execute individual unit tests by specifying the name of the test function using the TEST_FUNC environment variable and running the make unittest-one command. Alternatively, running make unittest will execute all unit tests.

Executing one unit test

$ TEST_FUNC=<name of the test function> make unittest-one
$ TEST_FUNC=TestByteList make unittest-one

Running all unit tests

$ make unittest

Integration tests, on the other hand, involve starting up the DiceDB server and running a series of commands to verify the expected end state and output. To execute a single integration test, you can set the TEST_FUNC environment variable to the name of the test function and run make test-one. Running make test will execute all integration tests.

Executing a single integration test

$ TEST_FUNC=<name of the test function> make test-one
$ TEST_FUNC=TestSet make test-one

Running all integration tests

$ make test

Work to add more tests in DiceDB is in progress, and we will soon port the test Redis suite to this codebase to ensure full compatibility.

Running Benchmark

$ go test -test.bench <pattern>
$ go test -test.bench BenchmarkListRedis -benchmem

Getting Started

To get started with building and contributing to DiceDB, please refer to the issues created in this repository.

Docs


We use Astro framework to power the dicedb.io website and Starlight to power the docs. Once you have NodeJS installed, fire the following commands to get your local version of dicedb.io running.

$ cd docs
$ npm install
$ npm run dev

Once the server starts, visit http://localhost:4321/ in your favourite browser. This runs with a hot reload which means any changes you make in the website and the documentation can be instantly viewed on the browser.

To build and deploy

$ cd docs
$ npm run build

Docs directory structure

  1. docs/src/content/docs/commands is where all the commands are documented
  2. docs/src/content/docs/tutorials is where all the tutorials are documented

The story

DiceDB started as a re-implementation of Redis in Golang, and the idea was to build a DB from scratch and understand the micro-nuances that come with its implementation. The database does not aim to replace Redis; instead, it will fit in and optimize itself for multicore computations running on a single-threaded event loop.

How to contribute

The Code Contribution Guidelines are published at CONTRIBUTING.md; please read them before you start making any changes. This would allow us to have a consistent standard of coding practices and developer experience.

Contributors can join the Discord Server for quick collaboration.

Community

Contributors

Troubleshoot

Forcefully killing the process

$ sudo netstat -atlpn | grep :7379
$ sudo kill -9 <process_id>

Architected for speed. Automated for easy. Monitoring and troubleshooting, transformed!


Netdata

Monitor your servers, containers, and applications,
in high-resolution and in real-time.



Visit the Project's Home Page


MENU: GETTING STARTED | HOW IT WORKS | FAQ | DOCS | COMMUNITY | CONTRIBUTE | LICENSE

Important 💡
People get addicted to Netdata. Once you use it on your systems, there's no going back!

Netdata is a high-performance, cloud-native, and on-premises observability platform designed to monitor metrics and logs with unparalleled efficiency. It delivers a simpler, faster, and significantly easier approach to real-time, low-latency monitoring for systems, containers, and applications. Netdata requires zero-configuration to get started, offering a powerful and comprehensive monitoring experience, out of the box.

Netdata is also known for its cost-efficient, distributed design. Unlike traditional monitoring solutions that centralize data, Netdata distributes the code. Instead of funneling all data into a few central databases, Netdata processes data at the edge, keeping it close to the source. The smart open-source Netdata Agent acts as a distributed database, enabling the construction of complex observability pipelines with modular, Lego-like simplicity.

Netdata provides A.I. insights for all monitored data, training machine learning models directly at the edge. This allows for fully automated and unsupervised anomaly detection, and with its intuitive APIs and UIs, users can quickly perform root cause analysis and troubleshoot issues, identifying correlations and gaining deeper insights into their infrastructure.

The Netdata Ecosystem

Netdata is built on three core components:

  1. Netdata Agent (usually called just "Netdata"): This open-source component is the heart of the Netdata ecosystem, handling data collection, storage (embedded database), querying, machine learning, exporting, and alerting of observability data. All observability data and features the Netdata ecosystem offers are managed by the Netdata Agent. It runs on physical and virtual servers, in cloud environments, Kubernetes clusters, and edge/IoT devices, and is carefully optimized to have zero impact on production systems and applications.

    Netdata Agent License: GPL v3+ CII Best Practices Coverity Scan

  2. Netdata Cloud: Enhancing the Netdata Agent, Netdata Cloud offers enterprise features such as user management, role-based access control, horizontal scalability, alert and notification management, access from anywhere, and more. Netdata Cloud does not centralize or store observability data.

    Netdata Cloud is a commercial product, available as an on-premises installation, or a SaaS solution, with a free community tier.

  3. Netdata UI: The user interface that powers all dashboards, data visualization, and configuration.

    While closed-source, it is free to use with both Netdata Agents and Netdata Cloud, via their public APIs. It is included in the binary packages offered by Netdata and its latest version is publicly available via a CDN.

    Netdata UI License: NCUL1

Netdata scales effortlessly from a single server to thousands, even in complex, multi-cloud or hybrid environments, with the ability to retain data for years.

Key characteristics of the Netdata Agent

  • 💥 Collects data from 800+ integrations
    Operating system metrics, container metrics, virtual machines, hardware sensors, application metrics, OpenMetrics exporters, StatsD, and logs. OpenTelemetry is on its way to being included (currently being developed)...

  • 💪 Real-Time, Low-Latency, High-Resolution
    All data are collected per second and are made available on the APIs for visualization, immediately after data collection (1-second latency, data collection to visualization).

  • 😶🌫 AI across the board
    Trains multiple Machine-Learning (ML) models at the edge, for each metric collected and uses AI to detect anomalies based on the past behavior of each metric.

  • 📜 systemd-journald Logs
    Includes tools to efficiently convert plain text log (text, csv, logfmt, json) files to structured systemd-journald entries (log2journal, systemd-cat-native) and queries systemd-journal files directly enabling powerful logs visualization dashboards. The Netdata Agents eliminate the need to centralize logs and provide all the functions to work with logs directly at the edge.

  • Lego-like Observability Pipelines
    Netdata Agents can be linked together (in parent-child relationships) to build observability centralization points within your infrastructure, allowing you to control data replication and retention at multiple levels.

  • 🔥 Fully Automated Powerful Visualization
    Using the NIDL (Nodes, Instances, Dimensions & Labels) data model, the Netdata Agent enables the creation of fully automated dashboards, providing correlated visualization of all metrics, allowing you to understand any dataset at first sight, and also to filter, slice and dice the data directly on the dashboards, without the need to learn a query language.

    Note: the Netdata UI is closed-source, but free to use with Netdata Agents and Netdata Cloud.

  • 🔔 Out of box Alerts
    Comes with hundreds of alerts out of the box to detect common issues and pitfalls, revealing issues that can easily go unnoticed. It supports several notification methods to let you know when your attention is needed.

  • 😎 Low Maintenance
    Fully automated in every aspect: automated dashboards, out-of-the-box alerts, auto-detection and auto-discovery of metrics, zero-touch machine-learning, easy scalability and high availability, and CI/CD friendly.

  • Open and Extensible
    Netdata is a modular platform that can be extended in all possible ways, and it also integrates nicely with other monitoring solutions.

What can be monitored with the Netdata Agent

Netdata monitors all the following:

Component | Linux | FreeBSD | macOS | Windows*
System Resources (CPU, Memory and system shared resources) | Full | Yes | Yes | Yes
Storage (Disks, Mount points, Filesystems, RAID arrays) | Full | Basic | Basic | Basic
Network (Network Interfaces, Protocols, Firewall, etc) | Full | Basic | Basic | Basic
Hardware & Sensors (Fans, Temperatures, Controllers, GPUs, etc) | Full | Some | Some | Some
O/S Services (Resources, Performance and Status) | Yes (systemd) | - | - | Basic
Logs | Yes (systemd-journal) | - | - | -
Processes (Resources, Performance, OOM, and more) | Yes | Yes | Yes | Yes
Network Connections (Live TCP and UDP sockets per PID) | Yes | - | - | -
Containers (Docker/containerd, LXC/LXD, Kubernetes, etc) | Yes | - | - | -
VMs, from the host (KVM, qemu, libvirt, Proxmox, etc) | Yes (cgroups) | - | - | Yes (Hyper-V)
Synthetic Checks (Test APIs, TCP ports, Ping, Certificates, etc) | Yes | Yes | Yes | Yes
Packaged Applications (nginx, apache, postgres, redis, mongodb, and hundreds more) | Yes | Yes | Yes | Yes
Cloud Provider Infrastructure (AWS, GCP, Azure, and more) | Yes | Yes | Yes | Yes
Custom Applications (OpenMetrics, StatsD and soon OpenTelemetry) | Yes | Yes | Yes | Yes

When the Netdata Agent runs on Linux, it monitors every kernel feature available, providing full coverage of all kernel technologies that can be monitored.

The Netdata Agent also provides full enterprise hardware coverage, monitoring all components that provide hardware error reporting, like PCI AER, RAM EDAC, IPMI, S.M.A.R.T., NVMe, Fans, Power, Voltages, and more.

* The Netdata Agent runs on Linux, FreeBSD and macOS. For Windows, we currently rely on Windows Exporter (so a Netdata running on Linux, FreeBSD or macOS is required, next to the monitored Windows servers). However, a Windows version of the Netdata Agent is at its final state for release.


Netdata is the most energy-efficient monitoring tool

Energy Efficiency

Dec 11, 2023: the University of Amsterdam published a study on the impact of monitoring tools for Docker-based systems, aiming to answer 2 questions:

  1. What is the impact of monitoring on the energy efficiency of Docker-based systems?
  2. What is the impact of monitoring on the performance of Docker-based systems?
  • 🚀 Netdata excels in energy efficiency: "... Netdata being the most energy-efficient tool ...", as the study says.
  • 🚀 Netdata excels in CPU Usage, RAM Usage and Execution Time, and has a similar impact on Network Traffic as Prometheus.

The study did not normalize the results based on the number of metrics collected. Given that Netdata usually collects significantly more metrics than the other tools, Netdata managed to outperform the other tools, while ingesting a much higher number of metrics. Read the full study here.


Netdata vs Prometheus

Netdata

On the same workload, Netdata uses 35% less CPU, 49% less RAM, 12% less bandwidth, 98% less disk I/O, and is 75% more disk-space efficient on high-resolution metrics storage, while providing more than a year of overall retention on the same disk footprint on which Prometheus offers 7 days of retention. Read the full analysis in our blog.


 

CNCF
Netdata actively supports and is a member of the Cloud Native Computing Foundation (CNCF)
 
...and due to your love ❤️, it is one of the most starred projects in the CNCF landscape!

 

Below is an animated image, but you can see Netdata live!
FRANKFURT | NEW YORK | ATLANTA | SAN FRANCISCO | TORONTO | SINGAPORE | BANGALORE
They are clustered Netdata Agent Parents. They all have the same data. Select the one closer to you.
All these run with the default configuration. We only clustered them to have multi-node dashboards.
Note: These demos include the Netdata UI,
which while being closed-source, is free to use with Netdata Agents and Netdata Cloud.

Netdata Agent


Getting Started


1. Install Netdata everywhere

Netdata can be installed on all Linux, macOS, FreeBSD (and soon on Windows) systems. We provide binary packages for the most popular operating systems and package managers.
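A minimal sketch of the usual one-line kickstart install (the script URL is taken from Netdata's public docs; verify it there before piping anything to a shell):

wget -O /tmp/netdata-kickstart.sh https://get.netdata.cloud/kickstart.sh
sh /tmp/netdata-kickstart.sh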

Check also the Netdata Deployment Guides to decide how to deploy it in your infrastructure.

By default, you will have immediately available a local dashboard. Netdata starts a web server for its dashboard at port 19999. Open up your web browser of choice and navigate to http://NODE:19999, replacing NODE with the IP address or hostname of your Agent. If installed on localhost, you can access it through http://localhost:19999.

Note: the binary packages we provide install the Netdata UI automatically. The Netdata UI is closed-source, but free to use with Netdata Agents and Netdata Cloud.

2. Configure Collectors 💥

Netdata auto-detects and auto-discovers most operating system data sources and applications. However, many data sources require some manual configuration, usually to allow Netdata to get access to the metrics.

  • For a detailed list of the 800+ collectors available, check this guide.
  • To monitor Windows servers and applications use this guide.
    Note that Netdata on Windows is at its final release stage, so at the next Netdata release Netdata will natively support Windows.
  • To monitor SNMP devices check this guide.

3. Configure Alert Notifications 🔔

Netdata comes with hundreds of pre-configured alerts that automatically check your metrics immediately after they start being collected.

Netdata can dispatch alert notifications to multiple third party systems, including: email, Alerta, AWS SNS, Discord, Dynatrace, flock, gotify, IRC, Matrix, MessageBird, Microsoft Teams, ntfy, OPSgenie, PagerDuty, Prowl, PushBullet, PushOver, RocketChat, Slack, SMS tools, Syslog, Telegram, Twilio.

By default, Netdata will send e-mail notifications, if there is a configured MTA on the system.

4. Configure Netdata Parents 👪

Optionally, configure one or more Netdata Parents. A Netdata Parent is a Netdata Agent that has been configured to accept streaming connections from other Netdata agents.

Netdata Parents provide:

  • Infrastructure level dashboards, at http://parent.server.ip:19999/.

    Each Netdata Agent has an API listening at the TCP port 19999 of each server. When you hit that port with a web browser (e.g. http://server.ip:19999/), the Netdata Agent UI is presented. When the Netdata Agent is also a Parent, the UI of the Parent includes data for all nodes that stream metrics to that Parent.

  • Increased retention for all metrics of all your nodes.

    Each Netdata Agent maintains its own database of metrics, but Parents can be given additional resources to maintain a much longer database than individual Netdata Agents.

  • Central configuration of alerts and dispatch of notifications.

    Using Netdata Parents, alert notification integrations can be configured once at the Parent and disabled at the individual Netdata Agents.

You can also use Netdata Parents to:

  • Offload your production systems (the parents run ML, alerts, queries, etc. for all their children)
  • Secure your production systems (the parents accept user connections, for all their children)
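A minimal streaming sketch, assuming the default port 19999 and a placeholder API key (any UUID you generate yourself); see the streaming documentation for the full set of options:

# on each child node: /etc/netdata/stream.conf
[stream]
    enabled = yes
    destination = PARENT_IP:19999
    api key = 11111111-2222-3333-4444-555555555555

# on the Parent: /etc/netdata/stream.conf
[11111111-2222-3333-4444-555555555555]
    enabled = yes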

5. Connect to Netdata Cloud

Sign in to Netdata Cloud and claim your Netdata Agents and Parents. If you connect your Netdata Parents, there is no need to connect your Netdata Agents; they will be connected via the Parents.

When your Netdata nodes are connected to Netdata Cloud, you can (on top of the above):

  • Access your Netdata agents from anywhere
  • Access sensitive Netdata agent features (like "Netdata Functions": processes, systemd-journal)
  • Organize your infra in Spaces and Rooms
  • Create, manage, and share custom dashboards
  • Invite your team and assign roles to them (Role Based Access Control - RBAC)
  • Get infinite horizontal scalability (multiple independent Netdata Agents are viewed as one infra)
  • Configure alerts from the UI
  • Configure data collection from the UI
  • Netdata Mobile App notifications

🤟 Netdata Cloud does not prevent you from using your Netdata Agents and Parents directly, and vice versa.

👌 Your metrics are still stored in your network when you connect your Netdata Agents and Parents to Netdata Cloud.


How it works

Netdata is built around a modular metrics processing pipeline.

Click to see more details about this pipeline...  

Each Netdata Agent can perform the following functions:

  1. COLLECT metrics from its sources
    Uses internal and external plugins to collect data from its sources.

    Netdata auto-detects and collects almost everything from the operating system: including CPU, Interrupts, Memory, Disks, Mount Points, Filesystems, Network Stack, Network Interfaces, Containers, VMs, Processes, systemd units, Linux Performance Metrics, Linux eBPF, Hardware Sensors, IPMI, and more.

    It collects application metrics from applications: PostgreSQL, MySQL/MariaDB, Redis, MongoDB, Nginx, Apache, and hundreds more.

    Netdata also collects your custom application metrics by scraping OpenMetrics exporters, or via StatsD.

    It can convert web server log files to metrics and apply ML and alerts to them, in real-time.

    And it also supports synthetic tests / white box tests, so you can ping servers, check API responses, or even check filesystem files and directories to generate metrics, train ML and run alerts and notifications on their status.

  2. STORE metrics to a database
    Uses database engine plugins to store the collected data, in memory, on disk, or both. We have developed our own dbengine to store the data very efficiently, allowing Netdata to use less than 1 byte per sample on disk while keeping queries amazingly fast.

  3. LEARN the behavior of metrics (ML)
    Trains multiple Machine Learning (ML) models per metric to learn the behavior of each metric individually. Netdata uses the k-means algorithm and, by default, creates one model per metric per hour, based on the values collected for that metric over the last 6 hours. The trained models are persisted to disk.

  4. DETECT anomalies in metrics (ML)
    Uses the trained machine learning (ML) models to detect outliers and mark collected samples as anomalies. Netdata stores anomaly information together with each sample and also streams it to Netdata Parents so that the anomaly is also available at query time for the whole retention of each metric.

  5. CHECK metrics and trigger alert notifications
    Uses its configured alerts (you can configure your own) to check the metrics for common issues and uses notifications plugins to send alert notifications.

  6. STREAM metrics to other Netdata Agents
    Push metrics in real-time to Netdata Parents.

  7. ARCHIVE metrics to 3rd party databases
    Export metrics to industry standard time-series databases, like Prometheus, InfluxDB, OpenTSDB, Graphite, etc. (see the configuration sketch below).

  8. QUERY metrics and present dashboards
    Provide an API to query the data and present interactive dashboards to users.

  9. SCORE metrics to reveal similarities and patterns
    Score the metrics according to the given criteria, to find the needle in the haystack.

When using Netdata Parents, all the functions of a Netdata Agent (except data collection) can be delegated to Parents to offload production systems.
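For the archiving step above, a hedged sketch of the exporting engine configuration, using a Graphite-protocol destination as an example (instance name and address are placeholders):

# /etc/netdata/exporting.conf
[exporting:global]
    enabled = yes

[graphite:my_graphite_instance]
    enabled = yes
    destination = localhost:2003
    update every = 10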

The core of Netdata is developed in C. We have our own libnetdata library, which provides:

  • DICTIONARY
    A high-performance algorithm to maintain both indexed and ordered pools of structures Netdata needs. It uses JudyHS arrays for indexing, although it is modular: any hashtable or tree can be integrated into it. Despite being in C, dictionaries follow object-oriented programming principles, so there are constructors, destructors, automatic memory management, garbage collection, and more. For more see here.

  • ARAL
    ARray ALlocator (ARAL) is used to minimize the system allocations made by Netdata. ARAL is optimized for maximum multithreaded performance. It also allows all structures that use it to be allocated in memory-mapped files (shared memory) instead of RAM. For more see here.

  • PROCFILE
    A high-performance /proc (but also any) file parser and text tokenizer. It achieves its performance by keeping files open and adjusting its buffers to read the entire file in one call (which is also required by the Linux kernel). For more see here.

  • STRING
    A string interning mechanism, for string deduplication and indexing (using JudyHS arrays), optimized for multithreaded usage. For more see here.

  • ARL
    Adaptive Resortable List (ARL) is a very fast list iterator that keeps the expected items in the same order they are found in the input list. The first iteration is somewhat slower, but all following iterations are perfectly aligned for best performance. For more see here.

  • BUFFER
    A flexible text buffer management system that allows Netdata to automatically handle dynamically sized text buffer allocations. The same mechanism is used for generating consistent JSON output by the Netdata APIs. For more see here.

  • SPINLOCK
    Like POSIX MUTEX and RWLOCK but a lot faster, based on atomic operations, with significantly smaller memory impact, while being portable.

  • PGC
    A caching layer that can be used to cache any kind of time-related data, with automatic indexing (based on a tree of JudyL arrays), memory management, evictions, flushing, pressure management. This is extensively used in dbengine. For more see here.

The above, and many more, allow Netdata developers to work on the application quickly and with confidence. Most of the business logic in Netdata is built by combining them.

Netdata data collection plugins can be developed in any language. Most of our application collectors though are developed in Go.

FAQ

🛡 Is Netdata secure?

Of course, it is! We do our best to ensure it is!

Click to see detailed answer ...  
 

We understand that Netdata is a piece of software installed on millions of production systems across the world, so it is important to us that Netdata is as secure as possible:

 
 

🌀 Will Netdata consume significant resources on my servers?

No. It will not! We promise this will be fast!

Click to see detailed answer ...  
 

Although each Netdata Agent is a complete monitoring solution packed into a single application, and despite the fact that Netdata collects every metric every single second and trains multiple ML models per metric, you will find that Netdata has amazing performance! In many cases, it outperforms other monitoring solutions that have significantly fewer features or far smaller data collection rates.

This is what you should expect:

  • For production systems, each Netdata Agent with default settings (everything enabled, ML, Health, DB) should consume about 5% CPU utilization of one core and about 150 MiB of RAM.

    By using a Netdata parent and streaming all metrics to that parent, you can disable ML & health and use an ephemeral DB (like alloc) on the children, leading to utilization of about 1% CPU of a single core and 100 MiB of RAM. Of course, these depend on how many metrics are collected.

  • For Netdata Parents, for about 1 to 2 million metrics, all collected every second, we suggest a server with 16 cores and 32GB RAM. Less than half of it will be used for data collection and ML. The rest will be available for queries.

Netdata has extensive internal instrumentation to help us reveal how the resources consumed are used. All these are available in the "Netdata Monitoring" section of the dashboard. Depending on your use case, there are many options to optimize resource consumption.

Even if you need to run Netdata on extremely weak embedded or IoT systems, you will find that Netdata can be tuned to be very performant.

 
 

📜 How much retention can I have?

As much as you need!

Click to see detailed answer ...  
 

Netdata supports tiering, to downsample past data and save disk space. With default settings, it has 3 tiers:

  1. tier 0: high-resolution, per-second data.
  2. tier 1: mid-resolution, per-minute data.
  3. tier 2: low-resolution, per-hour data.

All tiers are updated in parallel during data collection. Just increase the disk space you give to Netdata to get a longer history for your metrics. Tiers are automatically chosen at query time depending on the time frame and the resolution requested.

 
 

🚀 Does it scale? I really have a lot of servers!

Netdata is designed to scale and can handle large volumes of data.

Click to see detailed answer ...  
 
Netdata is a distributed monitoring solution. You can scale it to infinity by spreading Netdata Agents across your infrastructure.

With the streaming feature of the Agent, we can support monitoring ephemeral servers but also allow the creation of "monitoring islands" where metrics are aggregated to a few servers (Netdata Parents) for increased retention, or for offloading production systems.

  • Netdata Parents provide great vertical scalability, so your Parents can be as big as the CPU, RAM, and disk resources you can dedicate to them. In our lab we constantly stress test Netdata Parents with several million metrics collected per second, to ensure they are reliable, stable, and robust at scale.

  • 🚀 In addition, Netdata Cloud provides virtually unlimited horizontal scalability. It "merges" all the Netdata parents you have into one unified infrastructure at query time. Netdata Cloud itself is probably the biggest single installation monitoring platform ever created, currently monitoring about 100k online servers with about 10k servers changing state (added/removed) per day!

Example: the following chart comes from a single Netdata Parent. As you can see, 244 nodes stream metrics of about 20k running containers to it. This specific chart has 3 dimensions per container, so a total of about 60k time-series queries are executed to present it.

[chart: 244 nodes streaming metrics of about 20k containers to a single Netdata Parent]

 
 

💾 My production servers are very sensitive to disk I/O. Can I use Netdata?

Yes, you can!

Click to see detailed answer ...  
 

The Netdata Agent has been designed to spread disk writes across time. Each metric is flushed to disk every 17 minutes (1000 seconds), but metrics are flushed evenly across time, at an almost constant rate. Also, metrics are packed into bigger blocks we call extents and are compressed with ZSTD before saving them, to minimize the number of I/O operations made.

The Netdata Agent also employs direct I/O for all its database operations. By managing its own caches, Netdata avoids overburdening system caches, facilitating a harmonious coexistence with other applications.

Single node Agents (not Parents), should have a constant write rate of about 50 KiB/s or less, with some spikes above that every minute (flushing of tier 1) and higher spikes every hour (flushing of tier 2).

Health Alerts and Machine-Learning run queries to evaluate their expressions and learn from the metrics' patterns. These are also spread over time, so there should be an almost constant read rate too.

To make Netdata not use the disks at all, we suggest the following:

  1. Use database mode alloc or ram to disable writing metric data to disk.
  2. Configure streaming to push in real-time all metrics to a Netdata Parent. The Netdata Parent will maintain metrics on disk for this node.
  3. Disable ML and health on this node. The Netdata Parent will do them for this node.
  4. Use the Netdata Parent to access the dashboard.

Using the above, the Netdata Agent on your production system will not use a disk.
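A hedged sketch of such a diskless child (option names follow recent Netdata versions; verify them against your netdata.conf, and pair this with a stream.conf pointing to your Parent):

# /etc/netdata/netdata.conf on the production (child) node
[db]
    mode = ram        # keep metrics in memory only; no metric data written to disk

[ml]
    enabled = no      # the Parent trains ML models for this node

[health]
    enabled = no      # the Parent runs alerts for this node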

 
 

🤨 How is Netdata different from a Prometheus and Grafana setup?

Netdata is a "ready to use" monitoring solution. Prometheus and Grafana are tools to build your own monitoring solution.

Netdata is also a lot faster, requires significantly fewer resources, and puts almost no stress on the server it runs on. For a performance comparison, check this blog.

Click to see detailed answer ...  
 

First, we have to say that Prometheus as a time-series database and Grafana as a visualizer are excellent tools for what they do.

However, we believe that such a setup is missing a key element: a Prometheus and Grafana setup assumes that you know everything about the metrics you collect and that you deeply understand how they are structured, how they should be queried, and how they should be visualized.

In reality, this setup has a lot of problems. The vast number of technologies, operating systems, and applications we use in modern stacks makes it impossible for any single person to know and understand everything about all of them. We regularly get testimonials from Netdata users at the biggest enterprises that Netdata manages to reveal issues, anomalies, and problems they were not aware of and didn't even have the means to find or troubleshoot.

So, the biggest difference between Netdata and Prometheus plus Grafana is that we decided the tool needs to have a much better understanding of the components, the applications, and the metrics it monitors.

  • When compared to Prometheus, Netdata needs much more for each metric than just a name, some labels, and a value over time. A metric in Netdata is a structured entity that correlates with other metrics in a certain way and has specific attributes that define how it should be organized, treated, queried, and visualized. We call this the NIDL (Nodes, Instances, Dimensions, Labels) framework.

    Maintaining such an index is a challenge: first, because the raw metrics collected do not provide this information, so we have to add it, and second, because we need to maintain this index for the lifetime of each metric, which, with our current database retention, is usually more than a year.

    At the same time, Netdata provides better retention than Prometheus due to database tiering, scales easier than Prometheus due to streaming, supports anomaly detection and it has a metrics scoring engine to find the needle in the haystack when needed.

  • When compared to Grafana, Netdata is fully automated. Grafana has more customization capabilities than Netdata, but Netdata presents fully functional dashboards by itself and most importantly it gives you the means to understand, analyze, filter, slice and dice the data without the need for you to edit queries or be aware of any peculiarities the underlying metrics may have.

    Furthermore, to help you find the needle in the haystack, Netdata has advanced troubleshooting tools provided by the Netdata metrics scoring engine, which scores metrics based on their anomaly rate and their differences or similarities over any given time frame.

Still, if you are already familiar with Prometheus and Grafana, Netdata integrates nicely with them, and we have reports from users who use Netdata with Prometheus and Grafana in production.

 
 

🤨 How is Netdata different from DataDog, New Relic, Dynatrace, X SaaS Provider?

With Netdata your data are always on-prem and your metrics are always high-resolution.

Click to see detailed answer ...  
 

Most commercial monitoring providers face a significant challenge: they centralize all metrics to their infrastructure and this is, inevitably, expensive. It leads them to one or more of the following:

  1. be unrealistically expensive
  2. limit the number of metrics they collect
  3. limit the resolution of the metrics they collect

As a result, they try to find a balance: collect the least possible data, but collect enough to have something useful out of it.

We, at Netdata, see monitoring in a completely different way: monitoring systems should be built bottom-up and be rich in insights, so we focus on each component individually to collect, store, check and visualize everything related to each of them, and we make sure that all components are monitored. Each metric is important.

This is why Netdata trains multiple machine-learning models per metric, based exclusively on their own past (no sampling of data, no sharing of trained models), to detect anomalies based on the specific use case and workload in which each component is used.

This is also why Netdata alerts are attached to components (instances) and are configured with dynamic thresholds and rolling windows, instead of static values.

The distributed nature of Netdata helps scale this approach: your data is spread inside your infrastructure, as close to the edge as possible. Netdata is not one data lane. Each Netdata Agent is a data lane and all of them together build a massive distributed metrics processing pipeline that ensures all your infrastructure components and applications are monitored and operating as they should.

 
 

🤨 How is Netdata different from Nagios, Icinga, Zabbix, etc.?

Netdata offers real-time, comprehensive monitoring and the ability to monitor everything, without any custom configuration required.

Click to see detailed answer ...  
 

While Nagios, Icinga, Zabbix, and other similar tools are powerful and highly customizable, they can be complex to set up and manage. Their flexibility often comes at the cost of ease-of-use, especially for users who are not systems administrators or do not have extensive experience with these tools. Additionally, these tools generally require you to know what you want to monitor in advance and configure it explicitly.

Netdata, on the other hand, takes a different approach. It provides a "ready to use" monitoring solution with a focus on simplicity and comprehensiveness. It automatically detects and starts monitoring many different system metrics and applications out-of-the-box, without any need for custom configuration.

In comparison to these traditional monitoring tools, Netdata:

  • Provides real-time, high-resolution metrics, as opposed to the often minute-level granularity that tools like Nagios, Icinga, and Zabbix provide.

  • Automatically generates meaningful, organized, and interactive visualizations of the collected data. Unlike other tools, where you have to manually create and organize graphs and dashboards, Netdata takes care of this for you.

  • Applies machine learning to each individual metric to detect anomalies, providing more insightful and relevant alerts than static thresholds.

  • Is designed to be distributed, so your data is spread inside your infrastructure, as close to the edge as possible. This approach is more scalable and avoids the potential bottleneck of a single centralized server.

  • Has a more modern and user-friendly interface, making it easy for anyone, not just experienced administrators, to understand the health and performance of their systems.

Even if you're already using Nagios, Icinga, Zabbix, or similar tools, you can use Netdata alongside them to augment your existing monitoring capabilities with real-time insights and user-friendly dashboards.

 
 

😳 I feel overwhelmed by the amount of information in Netdata. What should I do?

Netdata is designed to provide comprehensive insights, but we understand that the richness of information might sometimes feel overwhelming. Here are some tips on how to navigate and utilize Netdata effectively...

Click to see detailed answer ...  
 

Netdata is indeed a very comprehensive monitoring tool. It's designed to provide you with as much information as possible about your system and applications, so that you can understand and address any issues that arise. However, we understand that the sheer amount of data can sometimes be overwhelming.

Here are some suggestions on how to manage and navigate this wealth of information:

  1. Start with the Metrics Dashboard
    Netdata's Metrics Dashboard provides a high-level summary of your system's status. We have added summary tiles to almost every section, to help you surface the most important information. This is a great place to start, as it can help you identify any major issues or trends at a glance.

  2. Use the Search Feature
    If you're looking for specific information, you can use the search feature to find the relevant metrics or charts. This can help you avoid scrolling through all the data.

  3. Customize your Dashboards
    Netdata allows you to create custom dashboards, which can help you focus on the metrics that are most important to you. Sign in to Netdata Cloud to create and keep your custom dashboards (coming soon to the Agent dashboard too).

  4. Leverage Netdata's Anomaly Detection
    Netdata uses machine learning to detect anomalies in your metrics. This can help you identify potential issues before they become major problems. We have added an AR button above the dashboard table of contents to reveal the anomaly rate per section so that you can easily spot what could need your attention.

  5. Take Advantage of Netdata's Documentation and Blogs
    Netdata has extensive documentation that can help you understand the different metrics and how to interpret them. You can also find tutorials, guides, and best practices there.

Remember, it's not necessary to understand every single metric or chart right away. Netdata is a powerful tool, and it can take some time to fully explore and understand all of its features. Start with the basics and gradually delve into more complex metrics as you become more comfortable with the tool.

 
 

Do I have to subscribe to Netdata Cloud?

Netdata Cloud delivers the full suite of features and functionality that Netdata offers, including a free community tier.

While our default onboarding process encourages users to take advantage of Netdata Cloud, including a complimentary one-month trial of our full business product, it is not mandatory. Users have the option to bypass this process entirely and still utilize the Netdata Agents along with the Netdata UI, without the need to sign up for Netdata Cloud.

Click to see detailed answer ...  
 

The Netdata Agent dashboard and the Netdata Cloud dashboard are the same. Still, Netdata Cloud provides additional features that the Netdata Agent alone does not offer. These include:

  1. Access your infrastructure from anywhere.
  2. Have SSO to protect sensitive features.
  3. Customizable (custom dashboards and other settings are persisted when you are signed in to Netdata Cloud)
  4. Configuration of Alerts and Data Collection from the UI
  5. Security (role-based access control - RBAC).
  6. Horizontal Scalability ("blend" multiple independent parents in one uniform infrastructure)
  7. Central Dispatch of Alert Notifications (even when multiple independent parents are involved)
  8. Mobile App for Alert Notifications

We encourage you to support Netdata by buying a Netdata Cloud subscription. A successful Netdata is a Netdata that evolves and improves to provide simpler, faster, and easier monitoring for all of us.

For organizations that need a fully on-prem solution, we provide Netdata Cloud for on-prem installation. Contact us for more information.

 
 

🔎 What does the anonymous telemetry collected by Netdata entail?

Your privacy is our utmost priority. As part of our commitment to improving Netdata, we rely on anonymous telemetry data from our users who choose to leave it enabled. This data greatly informs our decision-making processes and contributes to the future evolution of Netdata.

Should you wish to disable telemetry, instructions for doing so are provided in our installation guides.

Click to see detailed answer ...  
 

Netdata is in a constant state of growth and evolution. The decisions that guide this development are ideally rooted in data. By analyzing anonymous telemetry data, we can answer questions such as: "What features are being used frequently?", "How do we prioritize between potential new features?" and "What elements of Netdata are most important to our users?"

By leaving anonymous telemetry enabled, users indirectly contribute to shaping Netdata's roadmap, providing invaluable information that helps us prioritize our efforts for the project and the community.

We are aware that for privacy or regulatory reasons, not all environments can allow telemetry. To cater to this, we have simplified the process of disabling telemetry:

  • During installation, you can append --disable-telemetry to our kickstart.sh script, or
  • Create the file /etc/netdata/.opt-out-from-anonymous-statistics and then restart Netdata.

These steps will disable the anonymous telemetry for your Netdata installation.
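For example (the script path is the one used by the kickstart installer; use your system's service manager to restart):

# option 1: at installation time
sh /tmp/netdata-kickstart.sh --disable-telemetry

# option 2: on an existing installation
sudo touch /etc/netdata/.opt-out-from-anonymous-statistics
sudo systemctl restart netdata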

Please note, even with telemetry disabled, Netdata still requires a Netdata Registry for alert notifications' Call To Action (CTA) functionality. When you click an alert notification, it redirects you to the Netdata Registry, which then directs your web browser to the specific Netdata Agent that issued the alert for further troubleshooting. The Netdata Registry learns the URLs of your agents when you visit their dashboards.

Any Netdata Agent can act as a Netdata Registry. Simply designate one Netdata Agent as your registry, and our global Netdata Registry will no longer be in use. For further information on this, please refer to this guide.

 
 

😏 Who uses Netdata?

Netdata is a widely adopted project...

Click to see detailed answer ...  
 

Browse the Netdata stargazers on GitHub to discover users from renowned companies and enterprises, such as ABN AMRO Bank, AMD, Amazon, Baidu, Booking.com, Cisco, Delta, Facebook, Google, IBM, Intel, Logitech, Netflix, Nokia, Qualcomm, Realtek Semiconductor Corp, Redhat, Riot Games, SAP, Samsung, Unity, Valve, and many others.

Netdata also enjoys significant usage in academia, with notable institutions including New York University, Columbia University, New Jersey University, Seoul National University, University College London, among several others.

And, Netdata is also used by numerous governmental organizations worldwide.

In a nutshell, Netdata proves invaluable for:

  • Infrastructure intensive organizations
    Such as hosting/cloud providers and companies with hundreds or thousands of nodes, who require a high-resolution, real-time monitoring solution for a comprehensive view of all their components and applications.

  • Technology operators
    Those in need of a standardized, comprehensive solution for round-the-clock operations. Netdata not only facilitates operational automation and provides controlled access for their operations engineers, but also enhances skill development over time.

  • Technology startups
    Who seek a feature-rich monitoring solution from the get-go.

  • Freelancers
    Who seek a simple, efficient and straightforward solution without sacrificing performance and outcomes.

  • Professional SysAdmins and DevOps
    Who appreciate the fine details and understand the value of holistic monitoring from the ground up.

  • Everyone else
    All of us, who are tired of the inefficiency in the monitoring industry and would love a refreshing change and a breath of fresh air. 🙂

 
 

🌐 Is Netdata open-source?

The Netdata Agent is open-source, but the overall Netdata ecosystem is a hybrid solution, combining open-source and closed-source components.

Click to see detailed answer ...  
 

Open-source is about sharing intellectual property with the world, and at Netdata, we embrace this philosophy wholeheartedly.

The Netdata Agent, the core of our ecosystem and the engine behind all our observability features, is fully open-source. Licensed under GPLv3+, the Netdata Agent represents our commitment to open-sourcing innovation in a wide array of observability technologies, including data collection, database design, query engines, observability data modeling, machine learning and unsupervised anomaly detection, high-performance edge computing, real-time monitoring, and more.

The Netdata Agent is our gift to the world, ensuring that the cutting-edge advancements we've developed are freely accessible to everyone.

However, as a privately funded company, we also need to monetize our open-source software to demonstrate product-market fit and sustain our growth.

Traditionally, open-source projects have often used the open-core model, where a basic version of the software is open-source, and additional features are reserved for a commercial, closed-source version. This approach can limit access to advanced innovations, as most of these remain closed-source.

At Netdata, we take a slightly different path. We don't create a separate enterprise version of our product. Instead, all users—whether commercial or not—utilize the same Netdata Agent, ensuring that all our observability innovations are always open-source.

To experience the full capabilities of the Netdata ecosystem, users need to combine the open-source components with our closed-source offerings. The complete product still remains free to use.

The closed-source components include:

  • Netdata UI: This is closed-source but free to use with the Netdata Agents and Netdata Cloud. It’s also publicly available via a CDN.
  • Netdata Cloud: A commercial product available both as an on-premises installation and as a SaaS solution, with a free community tier.

By balancing open-source and closed-source components, we ensure that all users have access to our innovations while sustaining our ability to grow and innovate as a company.

 
 

💰 What is your monetization strategy?

Netdata generates revenue through subscriptions to advanced features of Netdata Cloud and sales of on-premise and private versions of Netdata Cloud.

Click to see detailed answer ...  
 

Netdata generates revenue from these activities:

  1. Netdata Cloud Subscriptions
    Direct funding for our project's vision comes from users subscribing to Netdata Cloud's advanced features.

  2. Netdata Cloud On-Prem or Private
    Purchasing the on-premises or private versions of Netdata Cloud supports our financial growth.

Our open-source community and the free access to Netdata Cloud contribute to Netdata in the following ways:

  • Netdata Cloud Community Use
    The free usage of Netdata Cloud demonstrates its market relevance. While this doesn't generate revenue, it reinforces trust among new users and aids in securing appropriate project funding.

  • User Feedback
    Feedback, especially issues and bug reports, is invaluable. It steers us towards a more resilient and efficient product. This, too, isn't a revenue source but is pivotal for our project's evolution.

  • Anonymous Telemetry Insights
    Users who keep anonymous telemetry enabled help us make data-informed decisions when refining and enhancing Netdata. This isn't a revenue stream, but knowing which features are used, and how, contributes to building a better product for everyone.

We don't monetize, directly or indirectly, users' or "device heuristics" data. Any data collected from community members is used exclusively for the purposes stated above.

Netdata grows financially when technology-intensive organizations and operators need, due to regulatory or business requirements, the entire Netdata suite on-prem or private, bundled with top-tier support. It is a win-win for all parties involved: these companies get a battle-tested, robust, and reliable solution, while the broader community that helps us build this product enjoys it at no cost.

 
 

📖 Documentation

Netdata's documentation is available at Netdata Learn.

This site also hosts a number of guides to help newer users better understand how to collect metrics, troubleshoot via charts, export to external databases, and more.

🎉 Community

Discord | Discourse | GitHub Discussions

Netdata is an inclusive open-source project and community. Please read our Code of Conduct.

Join the Netdata community:

Meet Up 🧑‍🤝‍🧑
The Netdata team and community members have regular online meetups.
You are welcome to join us! Click here for the schedule.

You can also find Netdata on:
Twitter | YouTube | Reddit | LinkedIn | StackShare | Product Hunt | Repology | Facebook

🙏 Contribute

Open Source Contributors

Contributions are essential to the success of open-source projects. In other words, we need your help to keep Netdata great!

What is a contribution? All the following are highly valuable to Netdata:

  1. Let us know of the best-practices you believe should be standardized
    Netdata should out-of-the-box detect as many infrastructure issues as possible. By sharing your knowledge and experiences, you help us build a monitoring solution that has baked into it all the best-practices about infrastructure monitoring.

  2. Let us know if Netdata is not perfect for your use case
    We aim to support as many use cases as possible and your feedback can be invaluable. Open a GitHub issue, or start a GitHub discussion about it, to discuss how you want to use Netdata and what you need.

    Although we can't implement everything imaginable, we try to prioritize development on use-cases that are common to our community, are in the same direction we want Netdata to evolve and are aligned with our roadmap.

  3. Support other community members
    Join our community on GitHub, Discord and Reddit. Generally, Netdata is relatively easy to set up and configure, but still people may need a little push in the right direction to use it effectively. Supporting other members is a great contribution by itself!

  4. Add or improve integrations you need
    Integrations tend to be easier and simpler to develop. If you would like to contribute code to Netdata, we suggest starting with the integrations you need that Netdata does not currently support.

General information about contributions:

  • Check our Security Policy.
  • Found a bug? Open a GitHub issue.
  • Read our Contributing Guide, which contains all the information you need to contribute to Netdata, such as improving our documentation, engaging in the community, and developing new features. We've made it as frictionless as possible, but if you need help, just ping us on our community forums!

Package maintainers should read the guide on building Netdata from source for instructions on building each Netdata component from the source and preparing a package.

License

The Netdata ecosystem comprises three key components:

  • Netdata Agent: The heart of the Netdata ecosystem, the Netdata Agent is an open-source tool that must be installed on all systems monitored by Netdata. It offers a wide range of essential features, including data collection via various plugins, an embedded high-performance time-series database (dbengine), unsupervised anomaly detection powered by edge-trained machine learning, alerting and notifications, as well as query and scoring engines with associated APIs. Additionally, it supports exporting data to third-party monitoring systems, among other capabilities.

    The Netdata Agent is released under the GPLv3+ license and redistributes several other open-source tools and libraries, which are listed in the Netdata Agent third-party licenses.

  • Netdata Cloud: A commercial, closed-source component, Netdata Cloud enhances the capabilities of the open-source Netdata Agent by providing horizontal scalability, centralized alert notification dispatch (including a mobile app), user management, role-based access control, and other enterprise-grade features. It is available both as a SaaS solution and for on-premises deployment, with a free-to-use community tier also offered.

  • Netdata UI: The Netdata UI is closed-source, and handles all visualization and dashboard functionalities related to metrics, logs and other collected data, as well as the central configuration and management of the Netdata ecosystem. It serves both the Netdata Agent and Netdata Cloud. The Netdata UI is distributed in binary form with the Netdata Agent and is publicly accessible via a CDN, licensed under the Netdata Cloud UI License 1 (NCUL1). It integrates third-party open-source components, detailed in the Netdata UI third-party licenses.

The binary installation packages provided by Netdata include the Netdata Agent and the Netdata UI. Since the Netdata Agent is open-source, it is frequently packaged by third parties (e.g. Linux Distributions) excluding the closed-source components (Netdata UI is not included). While their packages can still be useful in providing the necessary back-ends and the APIs of a fully functional monitoring solution, we recommend using the installation packages we provide to experience the full feature set of Netdata.

December 31, 1969  23:59:59

🔥🕷️ Crawl4AI: Open-source LLM-Friendly Web Crawler & Scraper


Crawl4AI (Async Version) 🕷️🤖


Crawl4AI simplifies asynchronous web crawling and data extraction, making it accessible for large language models (LLMs) and AI applications. 🆓🌐

Looking for the synchronous version? Check out README.sync.md. You can also access the previous version in the branch V0.2.76.

Try it Now!

✨ Play around with this Open In Colab

✨ Visit our Documentation Website

Features ✨

  • 🆓 Completely free and open-source
  • 🚀 Blazing fast performance, outperforming many paid services
  • 🤖 LLM-friendly output formats (JSON, cleaned HTML, markdown)
  • 🌍 Supports crawling multiple URLs simultaneously
  • 🎨 Extracts and returns all media tags (Images, Audio, and Video)
  • 🔗 Extracts all external and internal links
  • 📚 Extracts metadata from the page
  • 🔄 Custom hooks for authentication, headers, and page modifications before crawling
  • 🕵️ User-agent customization
  • 🖼️ Takes screenshots of the page
  • 📜 Executes multiple custom JavaScripts before crawling
  • 📊 Generates structured output without LLM using JsonCssExtractionStrategy
  • 📚 Various chunking strategies: topic-based, regex, sentence, and more
  • 🧠 Advanced extraction strategies: cosine clustering, LLM, and more
  • 🎯 CSS selector support for precise data extraction
  • 📝 Passes instructions/keywords to refine extraction
  • 🔒 Proxy support for enhanced privacy and access
  • 🔄 Session management for complex multi-page crawling scenarios
  • 🌐 Asynchronous architecture for improved performance and scalability

Installation 🛠️

Crawl4AI offers flexible installation options to suit various use cases. You can install it as a Python package or use Docker.

Using pip 🐍

Choose the installation option that best fits your needs:

Basic Installation

For basic web crawling and scraping tasks:

pip install crawl4ai

By default, this will install the asynchronous version of Crawl4AI, using Playwright for web crawling.

👉 Note: When you install Crawl4AI, the setup script should automatically install and set up Playwright. However, if you encounter any Playwright-related errors, you can manually install it using one of these methods:

  1. Through the command line:

    playwright install
    
  2. If the above doesn't work, try this more specific command:

    python -m playwright install chromium
    

This second method has proven to be more reliable in some cases.

Installation with Synchronous Version

If you need the synchronous version using Selenium:

pip install crawl4ai[sync]

Development Installation

For contributors who plan to modify the source code:

git clone https://github.com/unclecode/crawl4ai.git
cd crawl4ai
pip install -e .

Using Docker 🐳

We're in the process of creating Docker images and pushing them to Docker Hub. This will provide an easy way to run Crawl4AI in a containerized environment. Stay tuned for updates!

For more detailed installation instructions and options, please refer to our Installation Guide.

Quick Start 🚀

import asyncio
from crawl4ai import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(verbose=True) as crawler:
        result = await crawler.arun(url="https://www.nbcnews.com/business")
        print(result.markdown)

if __name__ == "__main__":
    asyncio.run(main())

Advanced Usage 🔬

Executing JavaScript and Using CSS Selectors

import asyncio
from crawl4ai import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(verbose=True) as crawler:
        js_code = ["const loadMoreButton = Array.from(document.querySelectorAll('button')).find(button => button.textContent.includes('Load More')); loadMoreButton && loadMoreButton.click();"]
        result = await crawler.arun(
            url="https://www.nbcnews.com/business",
            js_code=js_code,
            css_selector="article.tease-card",
            bypass_cache=True
        )
        print(result.extracted_content)

if __name__ == "__main__":
    asyncio.run(main())

Using a Proxy

import asyncio
from crawl4ai import AsyncWebCrawler

async def main():
    async with AsyncWebCrawler(verbose=True, proxy="http://127.0.0.1:7890") as crawler:
        result = await crawler.arun(
            url="https://www.nbcnews.com/business",
            bypass_cache=True
        )
        print(result.markdown)

if __name__ == "__main__":
    asyncio.run(main())

Extracting Structured Data without LLM

The JsonCssExtractionStrategy allows for precise extraction of structured data from web pages using CSS selectors.

import asyncio
import json
from crawl4ai import AsyncWebCrawler
from crawl4ai.extraction_strategy import JsonCssExtractionStrategy

async def extract_news_teasers():
    schema = {
        "name": "News Teaser Extractor",
        "baseSelector": ".wide-tease-item__wrapper",
        "fields": [
            {
                "name": "category",
                "selector": ".unibrow span[data-testid='unibrow-text']",
                "type": "text",
            },
            {
                "name": "headline",
                "selector": ".wide-tease-item__headline",
                "type": "text",
            },
            {
                "name": "summary",
                "selector": ".wide-tease-item__description",
                "type": "text",
            },
            {
                "name": "time",
                "selector": "[data-testid='wide-tease-date']",
                "type": "text",
            },
            {
                "name": "image",
                "type": "nested",
                "selector": "picture.teasePicture img",
                "fields": [
                    {"name": "src", "type": "attribute", "attribute": "src"},
                    {"name": "alt", "type": "attribute", "attribute": "alt"},
                ],
            },
            {
                "name": "link",
                "selector": "a[href]",
                "type": "attribute",
                "attribute": "href",
            },
        ],
    }

    extraction_strategy = JsonCssExtractionStrategy(schema, verbose=True)

    async with AsyncWebCrawler(verbose=True) as crawler:
        result = await crawler.arun(
            url="https://www.nbcnews.com/business",
            extraction_strategy=extraction_strategy,
            bypass_cache=True,
        )

        assert result.success, "Failed to crawl the page"

        news_teasers = json.loads(result.extracted_content)
        print(f"Successfully extracted {len(news_teasers)} news teasers")
        print(json.dumps(news_teasers[0], indent=2))

if __name__ == "__main__":
    asyncio.run(extract_news_teasers())

For more advanced usage examples, check out our Examples section in the documentation.

Extracting Structured Data with OpenAI

import os
import asyncio
from crawl4ai import AsyncWebCrawler
from crawl4ai.extraction_strategy import LLMExtractionStrategy
from pydantic import BaseModel, Field

class OpenAIModelFee(BaseModel):
    model_name: str = Field(..., description="Name of the OpenAI model.")
    input_fee: str = Field(..., description="Fee for input token for the OpenAI model.")
    output_fee: str = Field(..., description="Fee for output token for the OpenAI model.")

async def main():
    async with AsyncWebCrawler(verbose=True) as crawler:
        result = await crawler.arun(
            url='https://openai.com/api/pricing/',
            word_count_threshold=1,
            extraction_strategy=LLMExtractionStrategy(
                provider="openai/gpt-4o", api_token=os.getenv('OPENAI_API_KEY'), 
                schema=OpenAIModelFee.schema(),
                extraction_type="schema",
                instruction="""From the crawled content, extract all mentioned model names along with their fees for input and output tokens. 
                Do not miss any models in the entire content. One extracted model JSON format should look like this: 
                {"model_name": "GPT-4", "input_fee": "US$10.00 / 1M tokens", "output_fee": "US$30.00 / 1M tokens"}."""
            ),            
            bypass_cache=True,
        )
        print(result.extracted_content)

if __name__ == "__main__":
    asyncio.run(main())

Session Management and Dynamic Content Crawling

Crawl4AI excels at handling complex scenarios, such as crawling multiple pages with dynamic content loaded via JavaScript. Here's an example of crawling GitHub commits across multiple pages:

import asyncio
import re
from bs4 import BeautifulSoup
from crawl4ai import AsyncWebCrawler

async def crawl_typescript_commits():
    first_commit = ""
    async def on_execution_started(page):
        nonlocal first_commit 
        try:
            while True:
                await page.wait_for_selector('li.Box-sc-g0xbh4-0 h4')
                commit = await page.query_selector('li.Box-sc-g0xbh4-0 h4')
                commit = await commit.evaluate('(element) => element.textContent')
                commit = re.sub(r'\s+', '', commit)
                if commit and commit != first_commit:
                    first_commit = commit
                    break
                await asyncio.sleep(0.5)
        except Exception as e:
            print(f"Warning: New content didn't appear after JavaScript execution: {e}")

    async with AsyncWebCrawler(verbose=True) as crawler:
        crawler.crawler_strategy.set_hook('on_execution_started', on_execution_started)

        url = "https://github.com/microsoft/TypeScript/commits/main"
        session_id = "typescript_commits_session"
        all_commits = []

        js_next_page = """
        const button = document.querySelector('a[data-testid="pagination-next-button"]');
        if (button) button.click();
        """

        for page in range(3):  # Crawl 3 pages
            result = await crawler.arun(
                url=url,
                session_id=session_id,
                css_selector="li.Box-sc-g0xbh4-0",
                js=js_next_page if page > 0 else None,
                bypass_cache=True,
                js_only=page > 0
            )

            assert result.success, f"Failed to crawl page {page + 1}"

            soup = BeautifulSoup(result.cleaned_html, 'html.parser')
            commits = soup.select("li")
            all_commits.extend(commits)

            print(f"Page {page + 1}: Found {len(commits)} commits")

        await crawler.crawler_strategy.kill_session(session_id)
        print(f"Successfully crawled {len(all_commits)} commits across 3 pages")

if __name__ == "__main__":
    asyncio.run(crawl_typescript_commits())

This example demonstrates Crawl4AI's ability to handle complex scenarios where content is loaded asynchronously. It crawls multiple pages of GitHub commits, executing JavaScript to load new content and using custom hooks to ensure data is loaded before proceeding.

For more advanced usage examples, check out our Examples section in the documentation.

Speed Comparison 🚀

Crawl4AI is designed with speed as a primary focus. Our goal is to provide the fastest possible response with high-quality data extraction, minimizing abstractions between the data and the user.

We've conducted a speed comparison between Crawl4AI and Firecrawl, a paid service. The results demonstrate Crawl4AI's superior performance:

Firecrawl:
Time taken: 7.02 seconds
Content length: 42074 characters
Images found: 49

Crawl4AI (simple crawl):
Time taken: 1.60 seconds
Content length: 18238 characters
Images found: 49

Crawl4AI (with JavaScript execution):
Time taken: 4.64 seconds
Content length: 40869 characters
Images found: 89

As you can see, Crawl4AI outperforms Firecrawl significantly:

  • Simple crawl: Crawl4AI is over 4 times faster than Firecrawl.
  • With JavaScript execution: Even when executing JavaScript to load more content (doubling the number of images found), Crawl4AI is still faster than Firecrawl's simple crawl.

You can find the full comparison code in our repository at docs/examples/crawl4ai_vs_firecrawl.py.

Documentation 📚

For detailed documentation, including installation instructions, advanced features, and API reference, visit our Documentation Website.

Contributing 🤝

We welcome contributions from the open-source community. Check out our contribution guidelines for more information.

License 📄

Crawl4AI is released under the Apache 2.0 License.

Contact 📧

For questions, suggestions, or feedback, feel free to reach out:

Happy Crawling! 🕸️🚀

Star History

Star History Chart

December 31, 1969  23:59:59

The OS for your personal finances


dashboard_mockup (Note: The image above is a mockup of what we're working towards. We're rapidly approaching the functionality shown, but not all of the parts are ready just yet.)

Maybe: The OS for your personal finances

Get involved: Discord | Website | Issues

If you're looking for the previous React codebase, you can find it at maybe-finance/maybe-archive.

Backstory

We spent the better part of 2021/2022 building a personal finance + wealth management app called Maybe. It was very full-featured, including an "Ask an Advisor" feature which connected users with an actual CFP/CFA to help them with their finances (all included in your subscription).

The business end of things didn't work out, and so we shut things down mid-2023.

We spent the better part of $1,000,000 building the app (employees + contractors, data providers/services, infrastructure, etc.).

We're now reviving the product as a fully open-source project. The goal is to let you run the app yourself, for free, to manage your own finances, and eventually to offer a hosted version of the app for a small monthly fee.

Maybe Hosting

There are 3 primary ways to use the Maybe app:

  1. Managed (easiest) - coming soon...
  2. One-click deploy
  3. Self-host with Docker

Local Development Setup

If you are trying to self-host the Maybe app, stop here. You should read this guide to get started.

The instructions below are for developers to get started with contributing to the app.

Requirements

  • Ruby 3.3.4
  • PostgreSQL >9.3 (ideally, latest stable version)

After cloning the repo, the basic setup commands are:

cd maybe
cp .env.example .env
bin/setup
bin/dev

# Optionally, load demo data
rake demo_data:reset

And visit http://localhost:3000 to see the app. You can use the following credentials to log in (generated by DB seed):

For further instructions, see guides below.

Multi-currency support

If you'd like multi-currency support, there are a few extra steps to follow.

  1. Sign up for an API key at Synth. It's a Maybe product and the free plan is sufficient for basic multi-currency support.
  2. Add your API key to your .env file.
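A minimal sketch of the .env entry; the variable name below is a guess for illustration only, so check .env.example for the exact key:

# .env
SYNTH_API_KEY=your_synth_api_key_here   # hypothetical variable name, confirm in .env.example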

Setup Guides

Dev Container (optional)

This is 100% optional and meant for devs who don't want to worry about installing requirements manually for their platform. You can follow this guide to learn more about Dev Containers.

If you run into "could not connect to server" errors, you may need to change your .env's DB_HOST environment variable value to db to point to the Postgres container.
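For example, in your .env:

# point the app at the Postgres dev container instead of localhost
DB_HOST=db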

Mac

Please visit our Mac dev setup guide.

Linux

Please visit our Linux dev setup guide.

Windows

Please visit our Windows dev setup guide.

Testing Emails

In development, we use letter_opener to automatically open emails in your browser. When an email sends locally, a new browser tab will open with a preview.

Contributing

Before contributing, you'll likely find it helpful to understand context and general vision/direction.

Once you've done that, please visit our contributing guide to get started!

Repo Activity


Copyright & license

Maybe is distributed under an AGPLv3 license. "Maybe" is a trademark of Maybe Finance, Inc.

December 31, 1969  23:59:59

🤱🏻 Turn any webpage into a desktop app with Rust. 🤱🏻 利用 Rust 轻松构建轻量级多端桌面应用


English | 简体中文 | 日本語

Pake

Easily turn any webpage into a desktop app with Rust.

Pake supports Mac, Windows, and Linux. Check out README for Popular Packages, Command-Line Packaging, and Customized Development information. Feel free to share your suggestions in Discussions.

Features

  • 🎐 Nearly 20 times smaller than an Electron package (around 5M!)
  • 🚀 With Rust Tauri, Pake is much more lightweight and faster than JS-based frameworks.
  • 📦 Batteries-included package — shortcut pass-through, immersive windows, and minimalist customization.
  • 👻 Pake is just a simple tool — replace the old bundle approach with Tauri (though PWA is good enough).

Popular Packages

  • WeRead: Mac, Windows, Linux
  • Twitter: Mac, Windows, Linux
  • ChatGPT: Mac, Windows, Linux
  • Poe: Mac, Windows, Linux
  • YouTube Music: Mac, Windows, Linux
  • YouTube: Mac, Windows, Linux
  • LiZhi: Mac, Windows, Linux
  • ProgramMusic: Mac, Windows, Linux
  • Qwerty: Mac, Windows, Linux
  • CodeRunner: Mac, Windows, Linux
  • Flomo: Mac, Windows, Linux
  • XiaoHongShu: Mac, Windows, Linux
🏂 You can download more applications from Releases. Click here to expand the shortcuts reference!
Mac | Windows/Linux | Function
⌘ + [ | Ctrl + ← | Return to the previous page
⌘ + ] | Ctrl + → | Go to the next page
⌘ + ↑ | Ctrl + ↑ | Auto scroll to top of page
⌘ + ↓ | Ctrl + ↓ | Auto scroll to bottom of page
⌘ + r | Ctrl + r | Refresh the page
⌘ + w | Ctrl + w | Hide the window (not quit)
⌘ + - | Ctrl + - | Zoom out the page
⌘ + + | Ctrl + + | Zoom in the page
⌘ + = | Ctrl + = | Zoom in the page
⌘ + 0 | Ctrl + 0 | Reset the page zoom

In addition, double-click the title bar to switch to full-screen mode. On Mac, you can also use gestures to go to the previous or next page, and drag the title bar to move the window.

Before starting

  1. For beginners: Play with Popular Packages to find out Pake's capabilities, or try to pack your application with GitHub Actions. Don't hesitate to reach out for assistance in Discussions!
  2. For developers: "Command-Line Packaging" fully supports macOS. For Windows/Linux users, it requires some tinkering. Configure your environment before getting started.
  3. For hackers: If you are good at both front-end development and Rust, how about customizing your apps further with Customized Development below?

Command-Line Packaging

Pake

Pake provides a command line tool, making the flow of package customization quicker and easier. See documentation for more information.

# Install with npm
npm install -g pake-cli

# Command usage
pake url [OPTIONS]...

# Feel free to play with Pake! It might take a while to prepare the environment the first time you launch Pake.
pake https://weekly.tw93.fun --name Weekly --hide-title-bar

If you are new to the command line, you can compile packages online with GitHub Actions. See the Tutorial for more information.

Development

Prepare your environment before starting. Make sure you have Rust >=1.63 and Node >=16 (e.g., 16.18.1) installed on your computer. For installation guidance, see Tauri documentation.

If you are unfamiliar with these, it is better to try out the above tool to pack with one click.

# Install Dependencies
npm i

# Local development [Right-click to open debug mode.]
npm run dev

# Pack application
npm run build

Advanced Usage

  1. You can refer to the codebase structure before working on Pake, which will help you a lot during development.
  2. Modify the url and productName fields in the pake.json file under the src-tauri directory. The domain field in the tauri.config.json file needs to be updated to match, as well as the icon and identifier fields in the tauri.xxx.conf.json file. You can select an icon from the icons directory or download one from macOSicons to match your product needs.
  3. To configure window properties, edit pake.json and change the width, height, fullscreen, and resizable values of the windows property (see the sketch after this list). To adapt to the immersive header on Mac, set hideTitleBar to true, then look for the Header element and add a padding-top property.
  4. For advanced usages such as style rewriting, advertisement removal, JS injection, container message communication, and user-defined shortcut keys, see Advanced Usage of Pake.
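For orientation, here is a minimal sketch of how the fields above might fit together in pake.json, reusing the Weekly example from the packaging command. The values and exact nesting are illustrative assumptions rather than the canonical schema; the file generated in your own src-tauri directory is the source of truth.

{
  "url": "https://weekly.tw93.fun",
  "productName": "Weekly",
  "windows": {
    "width": 1200,
    "height": 780,
    "fullscreen": false,
    "resizable": true,
    "hideTitleBar": true
  }
}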

Developers

Pake's development would not be possible without these hackers, who have contributed many of its capabilities. You're welcome to follow them! ❤️

tw93
Tw93
Tlntin
Tlntin
jeasonnow
Santree
pan93412
Pan93412
wanghanzhen
Volare
liby
Bryan Lee
essesoul
Essesoul
YangguangZhou
Jerry Zhou
AielloChan
Aiello
m1911star
Horus
Pake-Actions
Pake Actions
eltociear
Ikko Eltociear Ashimine
QingZ11
Steam
exposir
孟世博
2nthony
2nthony
ACGNnsj
Null
imabutahersiddik
Abu Taher Siddik
kidylee
An Li
nekomeowww
Ayaka Neko
turkyden
Dengju Deng
Fechin
Fechin
ImgBotApp
Imgbot
droid-Q
Jiaqi Gu
mattbajorek
Matt Bajorek
Milo123459
Milo
princemaple
Po Chen
Tianj0o
Qitianjia
geekvest
Null
houhoz
Hyzhao
lakca
Null
liudonghua123
Liudonghua
liusishan
Liusishan
piaoyidage
Ranger
hetz
贺天卓

Frequently Asked Questions

  1. Right-clicking an image on the page to open the context menu and selecting "download image" (or similar actions) does not work; this is common on macOS. The issue is caused by the macOS built-in WebView, which does not support this feature.

Support

  1. I have two cats, TangYuan and Coke. If you think Pake delights your life, you can feed them some canned food 🥩.
  2. If you like Pake, you can star it on GitHub. Also, welcome to recommend Pake to your friends.
  3. You can follow my Twitter to get the latest news of Pake or join our Telegram chat group.
  4. I hope that you enjoy playing with it. Let us know if you find a website that would be great for a Mac App!

The first real AI developer


🧑‍✈️ GPT PILOT 🧑‍✈️


Discord Follow GitHub Repo stars Twitter Follow




GPT Pilot doesn't just generate code, it builds apps!


See it in action

(click to open the video in YouTube) (1:40min)


📫 If you would like to get updates on future releases or just get in touch, join our Discord server or you can add your email here. 📬



GPT Pilot aims to research how much LLMs can be utilized to generate fully working, production-ready apps while the developer oversees the implementation.

The main idea is that AI can write most of the code for an app (maybe 95%), but for the remaining ~5%, a developer is and will be needed until we get full AGI.

If you are interested in our learnings during this project, you can check our latest blog posts.





🔌 Requirements

  • Python 3.9+

🚦How to start using gpt-pilot?

If you're new to GPT Pilot:

After you have Python and (optionally) PostgreSQL installed, follow these steps:

  1. git clone https://github.com/Pythagora-io/gpt-pilot.git (clone the repo)
  2. cd gpt-pilot (go to the repo folder)
  3. python3 -m venv venv (create a virtual environment)
  4. source venv/bin/activate (or on Windows venv\Scripts\activate) (activate the virtual environment)
  5. pip install -r requirements.txt (install the dependencies)
  6. cp example-config.json config.json (create config.json file)
  7. Set your key and other settings in the config.json file (see the illustrative sketch after this list):
    • LLM Provider (openai, anthropic or groq) key and endpoints (leave null for default) (note that Azure and OpenRouter are supported via the openai setting)
    • Your API key (if null, will be read from the environment variables)
    • database settings: sqlite is used by default, PostgreSQL should also work
    • optionally update fs.ignore_paths and add files or folders which shouldn't be tracked by GPT Pilot in the workspace; this is useful for ignoring folders created by compilers
  8. python main.py (start GPT Pilot)
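As a rough orientation, a trimmed-down config.json might look something like the sketch below. Apart from db.url and fs.ignore_paths, which are referenced in this README, the field names are assumptions and the values are placeholders; treat example-config.json as the authoritative reference for your version.

{
  "llm": {
    "openai": {
      "base_url": null,
      "api_key": null
    }
  },
  "db": {
    "url": "sqlite+aiosqlite:///pythagora.db"
  },
  "fs": {
    "ignore_paths": ["node_modules", "dist"]
  }
}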

All generated code will be stored in the workspace folder, inside a subfolder named after the app name you enter when starting the pilot.

If you're upgrading from GPT Pilot v0.1

Assuming you already have the git repository with an earlier version:

  1. git pull (update the repo)
  2. source pilot-env/bin/activate (or on Windows pilot-env\Scripts\activate) (activate the virtual environment)
  3. pip install -r requirements.txt (install the new dependencies)
  4. python main.py --import-v0 pilot/gpt-pilot (this should import your settings and existing projects)

This will create a new database pythagora.db and import all apps from the old database. For each app, it will import the start of the latest task you were working on.

To verify that the import was successful, you can run python main.py --list to see all the apps you have created, and review config.json to verify the settings were correctly converted to the new config file format (making any adjustments if needed).

🔎 Examples

Click here to see all example apps created with GPT Pilot.

🐳 How to start gpt-pilot in docker?

  1. git clone https://github.com/Pythagora-io/gpt-pilot.git (clone the repo)
  2. Update the docker-compose.yml environment variables, which can be done via docker compose config. If you wish to use a local model, please go to https://localai.io/basics/getting_started/.
  3. By default, GPT Pilot will read & write to ~/gpt-pilot-workspace on your machine; you can also edit this in docker-compose.yml.
  4. Run docker compose build. This will build a gpt-pilot container for you.
  5. Run docker compose up.
  6. Access the web terminal on port 7681.
  7. python main.py (start GPT Pilot)

This will start two containers: one is a new image built from the Dockerfile, and the other is a Postgres database. The new image also has ttyd installed so that you can easily interact with gpt-pilot. Node is also installed on the image, and port 3000 is exposed.

PostgreSQL support

GPT Pilot uses a built-in SQLite database by default. If you want to use PostgreSQL instead, you need to additionally install the asyncpg and psycopg2 packages:

pip install asyncpg psycopg2

Then, you need to update the config.json file to set db.url to postgresql+asyncpg://<user>:<password>@<db-host>/<db-name>.
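For example, the db section of config.json would then look something like this (keeping the placeholders from above rather than real credentials):

{
  "db": {
    "url": "postgresql+asyncpg://<user>:<password>@<db-host>/<db-name>"
  }
}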

🧑‍💻️ CLI arguments

List created projects (apps)

python main.py --list

Note: for each project (app), this also lists "branches". Currently we only support having one branch (called "main"), and in the future we plan to add support for multiple project branches.

Load and continue from the latest step in a project (app)

python main.py --project <app_id>

Load and continue from a specific step in a project (app)

python main.py --project <app_id> --step <step>

Warning: this will delete all progress after the specified step!

Delete project (app)

python main.py --delete <app_id>

Delete project with the specified app_id. Warning: this cannot be undone!

Import projects from v0.1

python main.py --import-v0 <path>

This will import projects from the old GPT Pilot v0.1 database. The path should be the path to the old GPT Pilot v0.1 database. For each project, it will import the start of the latest task you were working on. If the project was already imported, the import procedure will skip it (won't overwrite the project in the database).

Other command-line options

There are several other command-line options that mostly support calling GPT Pilot from our VSCode extension. To see all the available options, use the --help flag:

python main.py --help

🏗 How GPT Pilot works?

Here are the steps GPT Pilot takes to create an app:

  1. You enter the app name and the description.
  2. Product Owner agent, like in real life, does nothing. :)
  3. Specification Writer agent asks a couple of questions to understand the requirements better if the project description is not detailed enough.
  4. Architect agent writes up technologies that will be used for the app and checks if all technologies are installed on the machine and installs them if not.
  5. Tech Lead agent writes up development tasks that the Developer must implement.
  6. Developer agent takes each task and writes up what needs to be done to implement it. The description is in human-readable form.
  7. Code Monkey agent takes the Developer's description and the existing file and implements the changes.
  8. Reviewer agent reviews every step of the task, and if something is done wrong, the Reviewer sends it back to the Code Monkey.
  9. Troubleshooter agent helps you to give good feedback to GPT Pilot when something is wrong.
  10. Debugger agent: you hate to see him, but he is your best friend when things go south.
  11. Technical Writer agent writes documentation for the project.

🕴How's GPT Pilot different from Smol developer and GPT engineer?

  • GPT Pilot works with the developer to create a fully working production-ready app - I don't think AI can (at least in the near future) create apps without a developer being involved. So, GPT Pilot codes the app step by step just like a developer would in real life. This way, it can debug issues as they arise throughout the development process. If it gets stuck, you, the developer in charge, can review the code and fix the issue. Other similar tools give you the entire codebase at once - this way, bugs are much harder to fix for AI and for you as a developer.

  • Works at scale - GPT Pilot isn't meant only for creating simple apps; it is designed to work at any scale. It has mechanisms that filter the codebase so that, in each LLM conversation, it doesn't need to keep the entire codebase in context; it shows the LLM only the code relevant to the current task. Once an app is finished, you can continue working on it by writing instructions for the feature you want to add.

🍻 Contributing

If you are interested in contributing to GPT Pilot, join our Discord server, check out open GitHub issues, and see if anything interests you. We would be happy to get help in resolving any of those. The best place to start is by reviewing blog posts mentioned above to understand how the architecture works before diving into the codebase.

🖥 Development

Other than the research, GPT Pilot needs to be debugged to work in different scenarios. For example, we realized that the quality of the code generated is very sensitive to the size of the development task. When the task is too broad, the code has too many bugs that are hard to fix, but when the development task is too narrow, GPT also seems to struggle in getting the task implemented into the existing code.

📊 Telemetry

To improve GPT Pilot, we are tracking some events from which you can opt out at any time. You can read more about it here.

🔗 Connect with us

🌟 As an open-source tool, it would mean the world to us if you starred the GPT-pilot repo 🌟

💬 Join the Discord server to get in touch.


A self-paced course to learn Rust, one exercise at a time.


Learn Rust, one exercise at a time

You've heard about Rust, but you never had the chance to try it out?
This course is for you!

You'll learn Rust by solving 100 exercises.
You'll go from knowing nothing about Rust to being able to start writing your own programs, one exercise at a time.

Note: This course has been written by Mainmatter.
It's one of the trainings in our portfolio of Rust workshops.
Check out our landing page if you're looking for Rust consulting or training!

Getting started

Go to rust-exercises.com and follow the instructions there to get started with the course.

Requirements

  • Rust (follow instructions here).
    If rustup is already installed on your system, run rustup update (or another appropriate command depending on how you installed Rust on your system) to make sure you're running on the latest stable version.
  • (Optional but recommended) An IDE with Rust autocompletion support (for example, an editor with rust-analyzer integration).

Solutions

You can find the solutions to the exercises in the solutions branch of this repository.

License

Copyright © 2024- Mainmatter GmbH (https://mainmatter.com), released under the Creative Commons Attribution-NonCommercial 4.0 International license.


All Algorithms implemented in Python


The Algorithms - Python

Gitpod Ready-to-Code Contributions Welcome Discord chat Gitter chat
GitHub Workflow Status pre-commit code style: black

All algorithms implemented in Python - for education

Implementations are for learning purposes only. They may be less efficient than the implementations in the Python standard library. Use them at your discretion.

Getting Started

Read through our Contribution Guidelines before you contribute.

Community Channels

We are on Discord and Gitter! Community channels are a great way for you to ask questions and get help. Please join us!

List of Algorithms

See our directory for easier navigation and a better overview of the project.


Investment Research for Everyone, Everywhere.



OpenBB Platform logo

Twitter Discord Shield Open in Dev Containers Open In Colab PyPI

The first financial Platform that is free and fully open source.

It offers access to equity, options, crypto, forex, macroeconomic, and fixed income data, and more, while also offering a broad range of extensions that let users tailor the experience to their needs.

Sign up to the OpenBB Hub to get the most out of the OpenBB ecosystem.

We also have an open source AI financial analyst agent that can access all the data within OpenBB, and that repo can be found here.


If you are looking for the first AI financial terminal for professionals, the OpenBB Terminal Pro can be found at pro.openbb.co


Table of Contents

  1. Installation
  2. Contributing
  3. License
  4. Disclaimer
  5. Contacts
  6. Star History
  7. Contributors

1. Installation

The OpenBB Platform can be installed as a PyPI package by running pip install openbb

or by cloning the repository directly with git clone https://github.com/OpenBB-finance/OpenBB.git.

Please find more about the installation process in the OpenBB Documentation.

OpenBB Platform CLI installation

The OpenBB Platform CLI is a command-line interface that allows you to access the OpenBB Platform directly from your terminal.

It can be installed by running pip install openbb-cli

or by cloning the repository directly with git clone https://github.com/OpenBB-finance/OpenBB.git.

Please find more about the installation process in the OpenBB Documentation.

The OpenBB Platform CLI offers an alternative to the former OpenBB Terminal, as it has the same look and feel while providing the functionality and extensibility of the OpenBB Platform.

2. Contributing

There are three main ways of contributing to this project. (Hopefully you have starred the project by now ⭐️)

Become a Contributor

Create a GitHub ticket

Before creating a ticket, make sure the one you are creating doesn't already exist here

Provide feedback

We are most active on our Discord, but feel free to reach out to us on any of our social media channels for feedback.

3. License

Distributed under the AGPLv3 License. See LICENSE for more information.

4. Disclaimer

Trading in financial instruments involves high risks including the risk of losing some, or all, of your investment amount, and may not be suitable for all investors.

Before deciding to trade in a financial instrument you should be fully informed of the risks and costs associated with trading the financial markets, carefully consider your investment objectives, level of experience, and risk appetite, and seek professional advice where needed.

The data contained in the OpenBB Platform is not necessarily accurate.

OpenBB and any provider of the data contained in this website will not accept liability for any loss or damage as a result of your trading, or your reliance on the information displayed.

All names, logos, and brands of third parties that may be referenced in our sites, products or documentation are trademarks of their respective owners. Unless otherwise specified, OpenBB and its products and services are not endorsed by, sponsored by, or affiliated with these third parties. Our use of these names, logos, and brands is for identification purposes only, and does not imply any such endorsement, sponsorship, or affiliation.

5. Contacts

If you have any questions about the terminal or anything OpenBB, feel free to email us at [email protected]

If you want to say hi, or are interested in partnering with us, feel free to reach us at [email protected]

Any of our social media platforms: openbb.co/links

6. Star History

This is a proxy for our growth and a sign that we are just getting started.

But for more metrics important to us check openbb.co/open.

Star History Chart

7. Contributors

OpenBB wouldn't be OpenBB without you. If we are going to disrupt the financial industry, every contribution counts. Thank you for being part of this journey.


24/7 local AI screen & mic recording. Build AI apps that have the full context. Works with Ollama. Alternative to Rewind.ai. Open. Secure. You own your data. Rust.


logo

   ___  ___ _ __ ___  ___ _ __  _ __ (_)_ __   ___ 
  / __|/ __| '__/ _ \/ _ \ '_ \| '_ \| | '_ \ / _ \
  \__ \ (__| | |  __/  __/ | | | |_) | | |_) |  __/
  |___/\___|_|  \___|\___|_| |_| .__/|_| .__/ \___|
                               |_|     |_|         

Download the Desktop App

YouTube Channel Subscribers

Join us on Discord X account Rewarded Bounties Open Bounties

Let's chat

demo


Latest News 🔥

  • [2024/09] screenpipe is number 1 github trending repo & on hackernews!
  • [2024/09] 150 users run screenpipe 24/7!
  • [2024/09] Released a v0 of our documentation
  • [2024/08] Anyone can now create, share, install pipes (plugins) from the app interface based on a github repo/dir
  • [2024/08] We're running bounties! Contribute to screenpipe & make money, check issues
  • [2024/08] Audio input & output now work perfectly on Windows, Linux, and macOS (<15.0). We also support multi-monitor capture and default STT to Whisper Distil large v3
  • [2024/08] We released video embedding. AI gives you links to your video recording in the chat!
  • [2024/08] We released the pipe store! Create, share, use plugins that get you the most out of your data in less than 30s, even if you are not technical.
  • [2024/08] We released Apple & Windows Native OCR.
  • [2024/08] The Linux desktop app is here!
  • [2024/07] The Windows desktop app is here! Get it now!
  • [2024/07] 🎁 Screenpipe won Friends (the AI necklace) hackathon at AGI House (integrations soon)
  • [2024/07] We just launched the desktop app! Download now!

24/7 Screen & Audio Capture

Library to build personalized AI powered by what you've seen, said, or heard. Works with Ollama. Alternative to Rewind.ai. Open. Secure. You own your data. Rust.
We are shipping daily; make suggestions, post bugs, give feedback.

diagram

Why?

Building a reliable stream of audio and screenshot data, where a user simply clicks a button and the script runs in the background 24/7, collecting and extracting data from screen and audio input/output, can be frustrating.

There are numerous use cases that can be built on top of this layer. To simplify life for other developers, we decided to solve this non-trivial problem. It's still in its early stages, but it works end-to-end. We're working on this full-time and would love to hear your feedback and suggestions.

Get started

There are multiple ways to install screenpipe:

  • as a CLI for technical users
  • as a paid desktop app with 1 year of updates, priority support, and priority features
  • as a free forever desktop app (but you need to build it yourself). We're 100% OSS.
  • get a 1-year desktop app license by sending a PR (example) or by sharing about screenpipe online
  • as a Rust or WASM library: check this websocket to stream frames + OCR into your app
  • as a business

👉 install screenpipe now

usage

screenpipe has a plugin system called "pipe", which lets you run code in a sandboxed environment within the Rust code. Get started.

examples

check examples

star history

GitHub Star History (10)

Contributing

Contributions are welcome! If you'd like to contribute, please read CONTRIBUTING.md.

Rewarded Bounties Open Bounties