GitHub Trending - Weekly

December 31, 1969  23:59:59

Interactive roadmaps, guides and other educational content to help developers grow in their careers.


roadmap.sh

Community driven roadmaps, articles and resources for developers



Roadmaps are now interactive, you can click the nodes to read more about the topics.

View all Roadmaps  ·  Best Practices  ·  Questions

Here is the list of available roadmaps with more being actively worked upon.

Have a look at the get started page that might help you pick up a path.

There are also interactive best practices:

There are also questions to help you test, rate, and improve your knowledge.

Share with the community

Please consider sharing a post about roadmap.sh and the value it provides. It really does help!


Development

Clone the repository, install the dependencies and start the application

git clone git@github.com:kamranahmedse/developer-roadmap.git
cd developer-roadmap
npm install
npm run dev

Note: use the depth parameter to reduce the clone size and speed up the clone.

git clone --depth=1 https://github.com/kamranahmedse/developer-roadmap.git

Contribution

Have a look at contribution docs for how to update any of the roadmaps

  • Add content to roadmaps
  • Add new roadmaps
  • Suggest changes to existing roadmaps
  • Discuss ideas in issues
  • Spread the word

Thanks to all contributors ❤

License

Have a look at the license file for details


Hydra is a game launcher with its own embedded bittorrent client



Hydra Launcher

Hydra is a game launcher with its own embedded bittorrent client.


About

Hydra is a Game Launcher with its own embedded BitTorrent Client.
The launcher is written in TypeScript (Electron) and Python, which handles the torrenting system by using libtorrent.

Features

  • Own embedded bittorrent client
  • How Long To Beat (HLTB) integration on game page
  • Downloads path customization
  • Windows and Linux support
  • Constantly updated
  • And more ...

Installation

Follow the steps below to install:

  1. Download the latest version of Hydra from the Releases page.
    • Download only the .exe file if you want to install Hydra on Windows.
    • Download the .deb, .rpm, or .zip file (depending on your Linux distro) if you want to install Hydra on Linux.
  2. Run the downloaded file.
  3. Enjoy Hydra!

Contributing

Join our Telegram

We concentrate our discussions on our Telegram channel.

Fork and clone your repository

  1. Fork the repository (click here to fork now)
  2. Clone your fork: git clone https://github.com/your_username/hydra
  3. Create a new branch
  4. Push your commits
  5. Submit a new Pull Request

Ways you can contribute

  • Translation: We want Hydra to be available to as many people as possible. Feel free to help translate to new languages or update and improve the ones that are already available on Hydra.
  • Code: Hydra is built with TypeScript, Electron, and a little bit of Python. If you want to contribute, join our Telegram!

Project Structure

  • torrent-client: We use libtorrent, through its Python bindings, to manage torrent downloads
  • src/renderer: the UI of the application
  • src/main: all the logic rests here.

Build from source

Install Node.js

Ensure you have Node.js installed on your machine. If not, download and install it from nodejs.org.

Install Yarn

Yarn is a package manager for Node.js. If you haven't installed Yarn yet, you can do so by following the instructions on yarnpkg.com.

Install Node Dependencies

Navigate to the project directory and install the Node dependencies using Yarn:

cd hydra
yarn

Install OpenSSL 1.1

OpenSSL 1.1 is required by libtorrent in Windows environments.

Install Python 3.9

Ensure you have Python 3.9 installed on your machine. You can download and install it from python.org.

Install Python Dependencies

Install the required Python dependencies using pip:

pip install -r requirements.txt

Environment variables

You'll need a SteamGridDB API key in order to fetch the game icons on installation.

Once you have it, copy or rename the .env.example file to .env and set the key as the value of STEAMGRIDDB_API_KEY.
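As a rough illustration of the step above, the sketch below parses .env-style text and checks that STEAMGRIDDB_API_KEY has a value. It is written in Python purely for illustration; the helper names parse_env and has_steamgriddb_key are hypothetical and are not part of Hydra, which may load its environment differently.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments (simplified)."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_steamgriddb_key(env: dict[str, str]) -> bool:
    """Return True when STEAMGRIDDB_API_KEY is present and non-empty."""
    return bool(env.get("STEAMGRIDDB_API_KEY"))


if __name__ == "__main__":
    # Hypothetical file contents; replace the placeholder with a real key.
    sample = "# copied from .env.example\nSTEAMGRIDDB_API_KEY=your_key_here\n"
    print(has_steamgriddb_key(parse_env(sample)))
```

A check like this can catch an empty or missing key before the application starts, rather than at the first icon download.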

Running

Once you've got all things set up, you can run the following command to start both the Electron process and the bittorrent client:

yarn dev

Build

Build the bittorrent client

Build the bittorrent client by using this command:

python torrent-client/setup.py build

Build the Electron application

Build the Electron application by using this command:

On Windows:

yarn build:win

On Linux:

yarn build:linux

Contributors

License

Hydra is licensed under the MIT License.


A programmer's guide to cooking at home (Simplified Chinese only).


The Programmer's Guide to Cooking (程序员做饭指南)


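Lately I have been staying home and cooking. As a programmer, I occasionally look up recipes and techniques online, but these recipes are written in all sorts of inconsistent ways, and ingredients often appear out of nowhere partway through. For programmers used to formal languages, this is extremely unfriendly.

So I plan to collect recipes myself, combine them with hands-on cooking experience, and document how to make common dishes in clearer, more precise terms, so that programmers can cook at home more easily.

Likewise, I hope this will be a community-driven, community-maintained open source project, so that more people can build an interesting repository together. Contributions are very welcome~

Local deployment

If you need to deploy the recipe web service locally, install Docker and then run the following commands: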

docker pull ghcr.io/anduin2017/how-to-cook:latest
docker run -d -p 5000:5000 ghcr.io/anduin2017/how-to-cook:latest

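To download the PDF version, visit /document.pdf in your browser.

How to contribute

If you find a problem, simply fix it and submit a pull request.

When writing a new recipe, copy and modify the existing recipe template: Sample Dish (示例菜)

Setting up the environment

Recipes

Index by difficulty

Vegetable dishes

Meat dishes

Seafood

Breakfast

Staple foods

Semi-prepared ingredients

Soups and congee

Drinks

Sauces and other ingredients

Desserts

Advanced learning

If you have already made many of the dishes above, have the basics of cooking down, and want to learn more advanced culinary techniques, read on: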


A free and open source, self-hosted, AI-based live meeting note taker and minutes summary generator that can run entirely on your local device (macOS and Windows support added; Linux support is in progress). https://meetily.zackriya.com/



Meetily - AI-Powered Meeting Assistant




Open source AI assistant for taking meeting notes

Website · Author · Discord Channel

An AI-Powered Meeting Assistant that captures live meeting audio, transcribes it in real-time, and generates summaries while ensuring user privacy. Perfect for teams who want to focus on discussions while automatically capturing and organizing meeting content without the need for external servers or complex infrastructure.

Meetily Demo
View full Demo Video

Overview

An AI-powered meeting assistant that captures live meeting audio, transcribes it in real-time, and generates summaries while ensuring user privacy. Perfect for teams who want to focus on discussions while automatically capturing and organizing meeting content.

Why?

While there are many meeting transcription tools available, this solution stands out by offering:

  • Privacy First: All processing happens locally on your device
  • Cost Effective: Uses open-source AI models instead of expensive APIs
  • Flexible: Works offline, supports multiple meeting platforms
  • Customizable: Self-host and modify for your specific needs
  • Intelligent: Built-in knowledge graph for semantic search across meetings

Features

✅ Modern, responsive UI with real-time updates

✅ Real-time audio capture (microphone + system audio)

✅ Live transcription using Whisper.cpp

🚧 Speaker diarization

✅ Local processing for privacy

✅ Packaged the app for macOS and Windows

🚧 Export to Markdown/PDF

Note: We have a Rust-based implementation that explores better performance and native integration. It currently implements:

  • ✅ Real-time audio capture from both microphone and system audio
  • ✅ Live transcription using locally-running Whisper
  • ✅ Speaker diarization
  • ✅ Rich text editor for notes

We are currently working on:

  • ✅ Export to Markdown/PDF
  • ✅ Export to HTML

Release 0.0.3

A new release is available!

Please check out the release here.

What's New

  • Windows Support: Fixed audio capture issues on Windows
  • Improved Error Handling: Better error handling and logging for audio devices
  • Enhanced Device Detection: More robust audio device detection across platforms
  • Windows Installers: Added both .exe and .msi installers for Windows
  • Transcription quality is improved
  • Bug fixes and improvements for frontend
  • Better backend app build process
  • Improved documentation

What would be next?

  • Database connection to save meeting minutes
  • Improve summarization quality for smaller LLM models
  • Add download options for meeting transcriptions
  • Add download option for summary

Known issues

  • Smaller LLMs can hallucinate, which degrades summary quality; please use a model with more than 32B parameters
  • The backend build process requires CMake, a C++ compiler, and other tools, which makes it harder to build
  • Backend build process requires Python 3.10 or newer
  • Frontend build process requires Node.js

LLM Integration

The backend supports multiple LLM providers through a unified interface. Current implementations include:

Supported Providers

  • Anthropic (Claude models)
  • Groq (Llama 3.2 90B)
  • Ollama (local models that support function calling)

Configuration

Create a .env file with your API keys:

# Required for Anthropic
ANTHROPIC_API_KEY=your_key_here  

# Required for Groq 
GROQ_API_KEY=your_key_here

System Architecture

High Level Architecture

Core Components

  1. Audio Capture Service

    • Real-time microphone/system audio capture
    • Audio preprocessing pipeline
    • Built with Rust (experimental) and Python
  2. Transcription Engine

    • Whisper.cpp for local transcription
    • Supports multiple model sizes (tiny->large)
    • GPU-accelerated processing
  3. LLM Orchestrator

    • Unified interface for multiple providers
    • Automatic fallback handling
    • Chunk processing with overlap
    • Model configuration:
  4. Data Services

    • ChromaDB: Vector store for transcript embeddings
    • SQLite: Process tracking and metadata storage
  5. API Layer

    • FastAPI endpoints:
      • POST /upload
      • POST /process
      • GET /summary/{id}
      • DELETE /summary/{id}
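The "chunk processing with overlap" item above can be sketched as follows: a long transcript is split into fixed-size windows that share a few words with their neighbours, so that sentences cut at a boundary still appear whole in one chunk. This is a minimal sketch under assumed parameters; the function name chunk_with_overlap and the word-based sizes are illustrative, and the project may chunk by tokens or characters instead.

```python
def chunk_with_overlap(words: list[str], size: int, overlap: int) -> list[list[str]]:
    """Split words into windows of `size` items that share `overlap` items."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks: list[list[str]] = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break  # the last window already reaches the end of the input
    return chunks


if __name__ == "__main__":
    transcript = "the meeting started late and the agenda was changed twice".split()
    for chunk in chunk_with_overlap(transcript, size=4, overlap=1):
        print(" ".join(chunk))
```

A larger overlap preserves more context across boundaries at the cost of sending more repeated text to the model.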

Deployment Architecture

  • Frontend: Tauri app + Next.js (packaged executables)
  • Backend: Python FastAPI:
    • Transcript workers
    • LLM inference

Prerequisites

  • Node.js 18+
  • Python 3.10+
  • FFmpeg
  • Rust 1.65+ (for experimental features)
  • CMake 3.22+ (for building the frontend)
  • For Windows: Visual Studio Build Tools with C++ development workload

Setup Instructions

1. Frontend Setup

Run packaged version

Go to the releases page and download the latest version.

For Windows:

  • Download either the .exe installer or .msi package
  • Once the installer is downloaded, double-click the executable file to run it
  • If Windows warns you about running an untrusted app, click "More info" and choose "Run anyway"
  • Follow the installation wizard to complete the setup
  • The application will be installed and available on your desktop

For macOS:

  • Download the dmg_darwin_arch64.zip file
  • Extract the file
  • Double-click the .dmg file inside the extracted folder
  • Drag the application to your Applications folder
  • Execute the following command in terminal to remove the quarantine attribute:
    xattr -c /Applications/meeting-minutes-frontend.app
    

Provide necessary permissions for audio capture and microphone access.

Dev run

# Navigate to frontend directory
cd frontend

# Give execute permissions to clean_build.sh
chmod +x clean_build.sh

# run clean_build.sh
./clean_build.sh

2. Backend Setup

# Clone the repository
git clone https://github.com/Zackriya-Solutions/meeting-minutes.git
cd meeting-minutes/backend

# Create and activate virtual environment
# On macOS/Linux:
python -m venv venv
source venv/bin/activate

# On Windows:
python -m venv venv
.\venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Add environment file with API keys
# On macOS/Linux:
echo -e "ANTHROPIC_API_KEY=your_api_key\nGROQ_API_KEY=your_api_key" | tee .env

# On Windows (PowerShell):
"ANTHROPIC_API_KEY=your_api_key`nGROQ_API_KEY=your_api_key" | Out-File -FilePath .env -Encoding utf8

# Configure environment variables for Groq
# On macOS/Linux:
export GROQ_API_KEY=your_groq_api_key

# On Windows (PowerShell):
$env:GROQ_API_KEY="your_groq_api_key"

# Build dependencies
# On macOS/Linux:
chmod +x build_whisper.sh
./build_whisper.sh

# On Windows:
.\build_whisper.bat

# Start backend servers
# On macOS/Linux:
./clean_start_backend.sh

# On Windows:
.\start_with_output.ps1

Development Guidelines

  • Follow the established project structure
  • Write tests for new features
  • Document API changes
  • Use type hints in Python code
  • Follow ESLint configuration for JavaScript/TypeScript

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Submit a pull request

License

MIT License - Feel free to use this project for your own purposes.

Introducing Subscription

We are planning to add a subscription option so that you don't have to run the backend on your own server. This will help you scale better and run the service 24/7. This is based on a few requests we received. If you are interested, please fill out the form here.

Last updated: March 3, 2025


An AI Hedge Fund Team


AI Hedge Fund

This is a proof of concept for an AI-powered hedge fund. The goal of this project is to explore the use of AI to make trading decisions. This project is for educational purposes only and is not intended for real trading or investment.

This system employs several agents working together:

  1. Ben Graham Agent - The godfather of value investing, only buys hidden gems with a margin of safety
  2. Bill Ackman Agent - An activist investor who takes bold positions and pushes for change
  3. Cathie Wood Agent - The queen of growth investing, believes in the power of innovation and disruption
  4. Charlie Munger Agent - Warren Buffett's partner, only buys wonderful businesses at fair prices
  5. Michael Burry Agent - The Big Short contrarian who hunts for deep value
  6. Peter Lynch Agent - Practical investor who seeks "ten-baggers" in everyday businesses
  7. Phil Fisher Agent - Meticulous growth investor who uses deep "scuttlebutt" research
  8. Stanley Druckenmiller Agent - Macro legend who hunts for asymmetric opportunities with growth potential
  9. Warren Buffett Agent - The oracle of Omaha, seeks wonderful companies at a fair price
  10. Valuation Agent - Calculates the intrinsic value of a stock and generates trading signals
  11. Sentiment Agent - Analyzes market sentiment and generates trading signals
  12. Fundamentals Agent - Analyzes fundamental data and generates trading signals
  13. Technicals Agent - Analyzes technical indicators and generates trading signals
  14. Risk Manager - Calculates risk metrics and sets position limits
  15. Portfolio Manager - Makes final trading decisions and generates orders

Note: the system only simulates trading decisions; it does not actually trade.
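To illustrate how signals from several agents might be combined into one simulated decision, here is a small Python sketch. Everything in it is an assumption made for illustration: the Signal class, the decide function, the equal weighting, and the 0.2 threshold are hypothetical and do not describe the project's actual portfolio or risk logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    agent: str         # e.g. "valuation" or "sentiment"
    stance: str        # "bullish", "bearish" or "neutral"
    confidence: float  # 0.0 to 1.0


def decide(signals: list[Signal], max_shares: int) -> tuple[str, int]:
    """Combine signals into (action, quantity), capped by a risk limit."""
    weights = {"bullish": 1.0, "bearish": -1.0, "neutral": 0.0}
    if not signals:
        return "hold", 0
    # Average of signed confidences: +1 is fully bullish, -1 fully bearish.
    score = sum(weights[s.stance] * s.confidence for s in signals) / len(signals)
    if abs(score) < 0.2:  # arbitrary illustrative threshold
        return "hold", 0
    quantity = int(max_shares * min(abs(score), 1.0))
    return ("buy" if score > 0 else "sell"), quantity


if __name__ == "__main__":
    signals = [
        Signal("valuation", "bullish", 0.8),
        Signal("sentiment", "bullish", 0.6),
        Signal("technicals", "neutral", 0.5),
    ]
    # max_shares stands in for the position limit set by a risk manager.
    print(decide(signals, max_shares=100))
```

In this toy version the risk manager's role is reduced to the max_shares cap, and the portfolio manager's role to the final call to decide.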


Disclaimer

This project is for educational and research purposes only.

  • Not intended for real trading or investment
  • No warranties or guarantees provided
  • Past performance does not indicate future results
  • Creator assumes no liability for financial losses
  • Consult a financial advisor for investment decisions

By using this software, you agree to use it solely for learning purposes.


Setup

Using Poetry

  1. Clone the repository:
git clone https://github.com/virattt/ai-hedge-fund.git
cd ai-hedge-fund
  2. Install Poetry (if not already installed):
curl -sSL https://install.python-poetry.org | python3 -
  3. Install dependencies:
poetry install
  4. Set up your environment variables:
# Create .env file for your API keys
cp .env.example .env
  5. Set your API keys:
# For running LLMs hosted by openai (gpt-4o, gpt-4o-mini, etc.)
# Get your OpenAI API key from https://platform.openai.com/
OPENAI_API_KEY=your-openai-api-key

# For running LLMs hosted by groq (deepseek, llama3, etc.)
# Get your Groq API key from https://groq.com/
GROQ_API_KEY=your-groq-api-key

# For getting financial data to power the hedge fund
# Get your Financial Datasets API key from https://financialdatasets.ai/
FINANCIAL_DATASETS_API_KEY=your-financial-datasets-api-key

Using Docker

  1. Make sure you have Docker installed on your system. If not, you can download it from Docker's official website.

  2. Clone the repository:

git clone https://github.com/virattt/ai-hedge-fund.git
cd ai-hedge-fund
  3. Set up your environment variables:
# Create .env file for your API keys
cp .env.example .env
  4. Edit the .env file to add your API keys as described above.

  5. Build the Docker image:

# On Linux/Mac:
./run.sh build

# On Windows:
run.bat build

Important: You must set OPENAI_API_KEY, GROQ_API_KEY, ANTHROPIC_API_KEY, or DEEPSEEK_API_KEY for the hedge fund to work. If you want to use LLMs from all providers, you will need to set all API keys.

Financial data for AAPL, GOOGL, MSFT, NVDA, and TSLA is free and does not require an API key.

For any other ticker, you will need to set the FINANCIAL_DATASETS_API_KEY in the .env file.
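The key requirements above can be expressed as a small pre-flight check. The sketch below is an assumption-laden illustration, not code from the project: the function name missing_config is hypothetical, and it ignores the case of running only local models.

```python
FREE_TICKERS = {"AAPL", "GOOGL", "MSFT", "NVDA", "TSLA"}
LLM_KEYS = ("OPENAI_API_KEY", "GROQ_API_KEY", "ANTHROPIC_API_KEY", "DEEPSEEK_API_KEY")


def missing_config(env: dict[str, str], tickers: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means the setup looks usable."""
    problems: list[str] = []
    # At least one LLM provider key must be set and non-empty.
    if not any(env.get(key) for key in LLM_KEYS):
        problems.append("set at least one LLM API key: " + ", ".join(LLM_KEYS))
    # Tickers outside the free set need the financial data key.
    paid = sorted(t.upper() for t in tickers if t.upper() not in FREE_TICKERS)
    if paid and not env.get("FINANCIAL_DATASETS_API_KEY"):
        problems.append("FINANCIAL_DATASETS_API_KEY is required for: " + ", ".join(paid))
    return problems


if __name__ == "__main__":
    import os

    for problem in missing_config(dict(os.environ), ["AAPL", "AMZN"]):
        print("config problem:", problem)
```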

Usage

Running the Hedge Fund

With Poetry

poetry run python src/main.py --ticker AAPL,MSFT,NVDA

With Docker

# On Linux/Mac:
./run.sh --ticker AAPL,MSFT,NVDA main

# On Windows:
run.bat --ticker AAPL,MSFT,NVDA main


You can also specify a --ollama flag to run the AI hedge fund using local LLMs.

# With Poetry:
poetry run python src/main.py --ticker AAPL,MSFT,NVDA --ollama

# With Docker (on Linux/Mac):
./run.sh --ticker AAPL,MSFT,NVDA --ollama main

# With Docker (on Windows):
run.bat --ticker AAPL,MSFT,NVDA --ollama main

You can also specify a --show-reasoning flag to print the reasoning of each agent to the console.

# With Poetry:
poetry run python src/main.py --ticker AAPL,MSFT,NVDA --show-reasoning

# With Docker (on Linux/Mac):
./run.sh --ticker AAPL,MSFT,NVDA --show-reasoning main

# With Docker (on Windows):
run.bat --ticker AAPL,MSFT,NVDA --show-reasoning main

You can optionally specify the start and end dates to make decisions for a specific time period.

# With Poetry:
poetry run python src/main.py --ticker AAPL,MSFT,NVDA --start-date 2024-01-01 --end-date 2024-03-01 

# With Docker (on Linux/Mac):
./run.sh --ticker AAPL,MSFT,NVDA --start-date 2024-01-01 --end-date 2024-03-01 main

# With Docker (on Windows):
run.bat --ticker AAPL,MSFT,NVDA --start-date 2024-01-01 --end-date 2024-03-01 main
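For readers curious how flags like these are typically parsed, here is a minimal argparse sketch that accepts the same options shown in the commands above. It is an illustration only and is not the project's actual src/main.py; the helper names parse_tickers and build_parser are hypothetical.

```python
import argparse
from datetime import date


def parse_tickers(value: str) -> list[str]:
    """Turn 'AAPL,MSFT,NVDA' into ['AAPL', 'MSFT', 'NVDA']."""
    return [t.strip().upper() for t in value.split(",") if t.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Illustrative CLI sketch")
    parser.add_argument("--ticker", required=True, type=parse_tickers,
                        help="comma-separated ticker symbols")
    # Dates use the YYYY-MM-DD form shown in the examples above.
    parser.add_argument("--start-date", type=date.fromisoformat, default=None)
    parser.add_argument("--end-date", type=date.fromisoformat, default=None)
    parser.add_argument("--show-reasoning", action="store_true")
    parser.add_argument("--ollama", action="store_true")
    return parser


if __name__ == "__main__":
    example = ["--ticker", "AAPL,MSFT,NVDA", "--start-date", "2024-01-01"]
    print(build_parser().parse_args(example))
```

Parsing dates with date.fromisoformat rejects malformed input at the command line instead of deep inside the run.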

Running the Backtester

With Poetry

poetry run python src/backtester.py --ticker AAPL,MSFT,NVDA

With Docker

# On Linux/Mac:
./run.sh --ticker AAPL,MSFT,NVDA backtest

# On Windows:
run.bat --ticker AAPL,MSFT,NVDA backtest


You can optionally specify the start and end dates to backtest over a specific time period.

# With Poetry:
poetry run python src/backtester.py --ticker AAPL,MSFT,NVDA --start-date 2024-01-01 --end-date 2024-03-01

# With Docker (on Linux/Mac):
./run.sh --ticker AAPL,MSFT,NVDA --start-date 2024-01-01 --end-date 2024-03-01 backtest

# With Docker (on Windows):
run.bat --ticker AAPL,MSFT,NVDA --start-date 2024-01-01 --end-date 2024-03-01 backtest

You can also specify a --ollama flag to run the backtester using local LLMs.

# With Poetry:
poetry run python src/backtester.py --ticker AAPL,MSFT,NVDA --ollama

# With Docker (on Linux/Mac):
./run.sh --ticker AAPL,MSFT,NVDA --ollama backtest

# With Docker (on Windows):
run.bat --ticker AAPL,MSFT,NVDA --ollama backtest

Project Structure

ai-hedge-fund/
├── src/
│   ├── agents/                   # Agent definitions and workflow
│   │   ├── bill_ackman.py        # Bill Ackman agent
│   │   ├── fundamentals.py       # Fundamental analysis agent
│   │   ├── portfolio_manager.py  # Portfolio management agent
│   │   ├── risk_manager.py       # Risk management agent
│   │   ├── sentiment.py          # Sentiment analysis agent
│   │   ├── technicals.py         # Technical analysis agent
│   │   ├── valuation.py          # Valuation analysis agent
│   │   ├── ...                   # Other agents
│   │   ├── warren_buffett.py     # Warren Buffett agent
│   ├── tools/                    # Agent tools
│   │   ├── api.py                # API tools
│   ├── backtester.py             # Backtesting tools
│   ├── main.py                   # Main entry point
├── pyproject.toml
├── ...

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

Important: Please keep your pull requests small and focused. This will make it easier to review and merge.

Feature Requests

If you have a feature request, please open an issue and make sure it is tagged with enhancement.

License

This project is licensed under the MIT License - see the LICENSE file for details.


An AI web browsing framework focused on simplicity and extensibility.


The production-ready framework for AI browser automations.
Read the Docs


Why Stagehand?

Most existing browser automation tools either require you to write low-level code in a framework like Selenium, Playwright, or Puppeteer, or use high-level agents that can be unpredictable in production. By letting developers choose what to write in code vs. natural language, Stagehand is the natural choice for browser automations in production.

  1. Choose when to write code vs. natural language: use AI when you want to navigate unfamiliar pages, and use code (Playwright) when you know exactly what you want to do.

  2. Preview and cache actions: Stagehand lets you preview AI actions before running them, and also helps you easily cache repeatable actions to save time and tokens.

  3. Computer use models with one line of code: Stagehand lets you integrate SOTA computer use models from OpenAI and Anthropic into the browser with one line of code.

Example

Here's how to build a sample browser automation with Stagehand:

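// Assumed setup (not shown in the original snippet): imports and a local Stagehand instance
import { Stagehand } from "@browserbasehq/stagehand";
import { z } from "zod";

const stagehand = new Stagehand({ env: "LOCAL" });
await stagehand.init();

// Use Playwright functions on the page object
const page = stagehand.page;
await page.goto("https://github.com/browserbase");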

// Use act() to execute individual actions
await page.act("click on the stagehand repo");

// Use Computer Use agents for larger actions
const agent = stagehand.agent({
    provider: "openai",
    model: "computer-use-preview",
});
await agent.execute("Get to the latest PR");

// Use extract() to read data from the page
const { author, title } = await page.extract({
  instruction: "extract the author and title of the PR",
  schema: z.object({
    author: z.string().describe("The username of the PR author"),
    title: z.string().describe("The title of the PR"),
  }),
});

Documentation

Visit docs.stagehand.dev to view the full documentation.

Getting Started

Get started with Stagehand with one line of code, or check out our Quickstart Guide for more information:

npx create-browser-app

Build and Run from Source

git clone https://github.com/browserbase/stagehand.git
cd stagehand
npm install
npx playwright install
npm run build
npm run example # run the blank script at ./examples/example.ts

Stagehand is best when you have an API key for an LLM provider and Browserbase credentials. To add these to your project, run:

cp .env.example .env
nano .env # Edit the .env file to add API keys

Contributing

Note: We highly value contributions to Stagehand! For questions or support, please join our Slack community.

At a high level, we're focused on improving reliability, speed, and cost in that order of priority. If you're interested in contributing, we strongly recommend reaching out to Anirudh Kamath or Paul Klein in our Slack community before starting to ensure that your contribution aligns with our goals.

For more information, please see our Contributing Guide.

Acknowledgements

This project heavily relies on Playwright as a resilient backbone to automate the web. It also would not be possible without the awesome techniques and discoveries made by tarsier, gemini-zod, and fuji-web.

We'd like to thank the following people for their major contributions to Stagehand:

License

Licensed under the MIT License.

Copyright 2025 Browserbase, Inc.


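PDF scientific paper translation with preserved formats: AI-based full-text bilingual translation of PDF documents that fully preserves the layout, supports services such as Google/DeepL/Ollama/OpenAI, and provides CLI/GUI/MCP/Docker/Zotero interfaces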


PDF scientific paper translation and bilingual comparison.

Feel free to provide feedback in GitHub Issues or Telegram Group.

For details on how to contribute, please consult the Contribution Guide.

Updates

  • [Mar. 3 2025] Experimental WebUI support for the new BabelDOC backend (by @awwaawwa)
  • [Feb. 22 2025] Better release CI and a well-packaged windows-amd64 exe (by @awwaawwa)
  • [Dec. 24 2024] The translator now supports local models on Xinference (by @imClumsyPanda)
  • [Dec. 19 2024] Non-PDF/A documents are now supported using -cp (by @reycn)
  • [Dec. 13 2024] Additional support for backend by (by @YadominJinta)
  • [Dec. 10 2024] The translator now supports OpenAI models on Azure (by @yidasanqian)

Preview

Online Service 🌟

You can try our application out using either of the following demos:

Note that the computing resources of the demo are limited, so please avoid abusing them.

Installation and Usage

Methods

For different use cases, we provide distinct methods to use our program:

1. UV install
  1. Python installed (3.10 <= version <= 3.12)

  2. Install our package:

    pip install uv
    uv tool install --python 3.12 pdf2zh
    
  3. Execute translation, files generated in current working directory:

    pdf2zh document.pdf
    
2. Windows exe
  1. Download pdf2zh-version-win64.zip from release page

  2. Unzip and double-click pdf2zh.exe to run.

3. Graphic user interface
  1. Python installed (3.10 <= version <= 3.12)

  2. Install our package:

    pip install pdf2zh

  3. Start using in browser:

    pdf2zh -i

  4. If your browser has not been started automatically, go to

    http://localhost:7860/
    

See documentation for GUI for more details.

4. Docker
  1. Pull and run:

    docker pull byaidu/pdf2zh
    docker run -d -p 7860:7860 byaidu/pdf2zh
    
  2. Open in browser:

    http://localhost:7860/
    

For docker deployment on cloud service:

5. Zotero Plugin

See Zotero PDF2zh for more details.

6. Commandline
  1. Python installed (3.10 <= version <= 3.12)

  2. Install our package:

    pip install pdf2zh
    
  3. Execute translation, files generated in current working directory:

    pdf2zh document.pdf
    

Tip: the image can also be pulled from GitHub Container Registry (ghcr.io):

docker pull ghcr.io/byaidu/pdfmathtranslate
docker run -d -p 7860:7860 ghcr.io/byaidu/pdfmathtranslate

Unable to install?

The program needs an AI model (wybxc/DocLayout-YOLO-DocStructBench-onnx) before it can run, and some users cannot download it because of network issues. If you have trouble downloading this model, you can work around it by setting the following environment variable:

set HF_ENDPOINT=https://hf-mirror.com

For PowerShell users:

$env:HF_ENDPOINT = "https://hf-mirror.com"

If this does not work for you or you encounter other issues, please refer to the frequently asked questions.
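When pdf2zh is launched from a script rather than an interactive shell, the same workaround can be applied by passing a modified environment to the child process. The sketch below only builds that environment; the helper name env_with_mirror is hypothetical, and the assumption is that the model download honours HF_ENDPOINT exactly as it does for the shell commands above.

```python
import os
from typing import Optional

MIRROR = "https://hf-mirror.com"


def env_with_mirror(base: Optional[dict[str, str]] = None,
                    endpoint: str = MIRROR) -> dict[str, str]:
    """Return a copy of the environment with HF_ENDPOINT pointing at the mirror."""
    env = dict(os.environ if base is None else base)
    env["HF_ENDPOINT"] = endpoint
    return env


if __name__ == "__main__":
    # Print the value that a child process would see.
    print(env_with_mirror({})["HF_ENDPOINT"])
```

The returned dictionary can be handed to subprocess.run(..., env=...) so that the setting applies only to that one invocation and does not leak into the rest of the session.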

Advanced Options

Execute the translation command in the command line to generate the translated document example-mono.pdf and the bilingual document example-dual.pdf in the current working directory. Google is used as the default translation service. More supported translation services are listed HERE.


In the following table, we list all advanced options for reference:

Option               Function                               Example
files                Local files                            pdf2zh ~/local.pdf
links                Online files                           pdf2zh http://arxiv.org/paper.pdf
-i                   Enter GUI                              pdf2zh -i
-p                   Partial document translation           pdf2zh example.pdf -p 1
-li                  Source language                        pdf2zh example.pdf -li en
-lo                  Target language                        pdf2zh example.pdf -lo zh
-s                   Translation service                    pdf2zh example.pdf -s deepl
-t                   Multi-threads                          pdf2zh example.pdf -t 1
-o                   Output dir                             pdf2zh example.pdf -o output
-f, -c               Exceptions                             pdf2zh example.pdf -f "(MS.*)"
-cp                  Compatibility mode                     pdf2zh example.pdf --compatible
--skip-subset-fonts  Skip font subset                       pdf2zh example.pdf --skip-subset-fonts
--ignore-cache       Ignore translate cache                 pdf2zh example.pdf --ignore-cache
--share              Public link                            pdf2zh -i --share
--authorized         Authorization                          pdf2zh -i --authorized users.txt [auth.html]
--prompt             Custom prompt                          pdf2zh --prompt [prompt.txt]
--onnx               Use custom DocLayout-YOLO ONNX model   pdf2zh --onnx [onnx/model/path]
--serverport         Use custom WebUI (Gradio) server port  pdf2zh --serverport 7860
--dir                Batch translate                        pdf2zh --dir /path/to/translate/
--config             Configuration file                     pdf2zh --config /path/to/config/config.json
--babeldoc           Use experimental BabelDOC backend      pdf2zh --babeldoc -s openai example.pdf
--mcp                Enable MCP STDIO mode                  pdf2zh --mcp
--sse                Enable MCP SSE mode                    pdf2zh --mcp --sse

For detailed explanations, please refer to our document about Advanced Usage for a full list of each option.
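When driving the command line from another program, it can help to assemble the argument list in one place. The Python sketch below builds a pdf2zh command from a handful of the options listed above (-li, -lo, -s, -t, -o); the function build_pdf2zh_command is a hypothetical helper, not part of the package, and it only constructs the list without running anything.

```python
from typing import Optional, Sequence


def build_pdf2zh_command(
    files: Sequence[str],
    service: Optional[str] = None,
    lang_in: Optional[str] = None,
    lang_out: Optional[str] = None,
    threads: Optional[int] = None,
    output_dir: Optional[str] = None,
) -> list[str]:
    """Assemble an argument list using only flags from the options table."""
    cmd = ["pdf2zh", *files]
    options = (
        ("-li", lang_in),    # source language
        ("-lo", lang_out),   # target language
        ("-s", service),     # translation service
        ("-t", threads),     # number of threads
        ("-o", output_dir),  # output directory
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    print(build_pdf2zh_command(["example.pdf"], service="deepl", lang_in="en", lang_out="zh"))
```

Passing the result to subprocess.run as a list avoids shell quoting problems with file names that contain spaces.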

Secondary Development (APIs)

For downstream applications, please refer to our document about API Details for further information about:

  • Python API, how to use the program in other Python programs
  • HTTP API, how to communicate with a server with the program installed

TODOs

  • Parse layout with DocLayNet based models, PaddleX, PaperMage, SAM2

  • Fix page rotation, table of contents, format of lists

  • Fix pixel formula in old papers

  • Async retry except KeyboardInterrupt

  • Knuth–Plass algorithm for western languages

  • Support non-PDF/A files

  • Plugins of Zotero and Obsidian

Acknowledgements

Contributors


🚀 The fast, Pythonic way to build MCP servers and clients


FastMCP v2 🚀

The fast, Pythonic way to build MCP servers and clients.


The Model Context Protocol (MCP) is a new, standardized way to provide context and tools to your LLMs, and FastMCP makes building MCP servers and clients simple and intuitive. Create tools, expose resources, define prompts, and connect components with clean, Pythonic code.

# server.py
from fastmcp import FastMCP

mcp = FastMCP("Demo 🚀")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

if __name__ == "__main__":
    mcp.run()

Run the server locally:

fastmcp run server.py

FastMCP handles the complex protocol details and server management, letting you focus on building great tools and applications. It's designed to feel natural to Python developers.


What is MCP?

The Model Context Protocol (MCP) lets you build servers that expose data and functionality to LLM applications in a secure, standardized way. Think of it like a web API, but specifically designed for LLM interactions. MCP servers can:

  • Expose data through Resources (think GET endpoints; load info into context)
  • Provide functionality through Tools (think POST/PUT endpoints; execute actions)
  • Define interaction patterns through Prompts (reusable templates)
  • And more!

FastMCP provides a high-level, Pythonic interface for building and interacting with these servers.

Why FastMCP?

The MCP protocol is powerful but implementing it involves a lot of boilerplate - server setup, protocol handlers, content types, error management. FastMCP handles all the complex protocol details and server management, so you can focus on building great tools. It's designed to be high-level and Pythonic; in most cases, decorating a function is all you need.
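To show why "decorating a function" can be enough, here is a toy, standard-library-only sketch of decorator-based registration. It is not FastMCP and does not speak the MCP protocol; the ToyServer class and its tool, describe, and call methods are hypothetical names used only to illustrate the pattern of collecting functions and deriving metadata from their signatures.

```python
import inspect
from typing import Any, Callable


class ToyServer:
    """Toy stand-in showing decorator-based registration; not FastMCP itself."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: dict[str, Callable[..., Any]] = {}

    def tool(self) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Return a decorator that records the function under its own name."""
        def register(func: Callable[..., Any]) -> Callable[..., Any]:
            self.tools[func.__name__] = func
            return func
        return register

    def describe(self, tool_name: str) -> dict[str, Any]:
        """Derive a description from the docstring and type hints."""
        func = self.tools[tool_name]
        params = {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in inspect.signature(func).parameters.items()
        }
        return {"name": tool_name, "description": inspect.getdoc(func), "parameters": params}

    def call(self, tool_name: str, **arguments: Any) -> Any:
        """Look up a registered tool and invoke it with keyword arguments."""
        return self.tools[tool_name](**arguments)


server = ToyServer("Demo")


@server.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b


if __name__ == "__main__":
    print(server.describe("add"))
    print(server.call("add", a=2, b=3))
```

The point of the pattern is that the function's name, docstring, and type hints already carry the information a server needs to advertise a tool, so no separate schema has to be written by hand.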

FastMCP aims to be:

🚀 Fast: High-level interface means less code and faster development

🍀 Simple: Build MCP servers with minimal boilerplate

🐍 Pythonic: Feels natural to Python developers

🔍 Complete: FastMCP aims to provide a full implementation of the core MCP specification for both servers and clients

Key Features

Servers

  • Create servers with minimal boilerplate using intuitive decorators
  • Proxy existing servers to modify configuration or transport
  • Compose servers into complex applications
  • Generate servers from OpenAPI specs or FastAPI objects

Clients

  • Interact with MCP servers programmatically
  • Connect to any MCP server using any transport
  • Test your servers without manual intervention
  • Innovate with core MCP capabilities like LLM sampling

What's New in v2?

FastMCP 1.0 made it so easy to build MCP servers that it's now part of the official Model Context Protocol Python SDK! For basic use cases, you can use the upstream version by importing mcp.server.fastmcp.FastMCP (or installing fastmcp==1.0).

Based on how the MCP ecosystem is evolving, FastMCP 2.0 builds on that foundation to introduce a variety of new features (and more experimental ideas). It adds advanced features like proxying and composing MCP servers, as well as automatically generating them from OpenAPI specs or FastAPI objects. FastMCP 2.0 also introduces new client-side functionality like LLM sampling.

Documentation

📚 FastMCP's documentation is available at gofastmcp.com.


Installation

We strongly recommend installing FastMCP with uv, as it is required for deploying servers via the CLI:

uv pip install fastmcp

Note: on macOS, uv may need to be installed with Homebrew (brew install uv) in order to make it available to the Claude Desktop app.

For development, install with:

# Clone the repo first
git clone https://github.com/jlowin/fastmcp.git
cd fastmcp
# Install with dev dependencies
uv sync

Quickstart

Let's create a simple MCP server that exposes a calculator tool and some data:

# server.py
from fastmcp import FastMCP

# Create an MCP server
mcp = FastMCP("Demo")

# Add an addition tool
@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

# Add a dynamic greeting resource
@mcp.resource("greeting://{name}")
def get_greeting(name: str) -> str:
    """Get a personalized greeting"""
    return f"Hello, {name}!"

You can install this server in Claude Desktop and interact with it right away by running:

fastmcp install server.py

Core Concepts

These are the building blocks for creating MCP servers, using the familiar decorator-based approach.

The FastMCP Server

The central object representing your MCP application. It handles connections, protocol details, and routing.

from fastmcp import FastMCP

# Create a named server
mcp = FastMCP("My App")

# Specify dependencies needed when deployed via `fastmcp install`
mcp = FastMCP("My App", dependencies=["pandas", "numpy"])

Tools

Tools allow LLMs to perform actions by executing your Python functions. They are ideal for tasks that involve computation, external API calls, or side effects.

Decorate synchronous or asynchronous functions with @mcp.tool(). FastMCP automatically generates the necessary MCP schema based on type hints and docstrings. Pydantic models can be used for complex inputs.

import httpx
from pydantic import BaseModel

class UserInfo(BaseModel):
    user_id: int
    notify: bool = False

@mcp.tool()
async def send_notification(user: UserInfo, message: str) -> dict:
    """Sends a notification to a user if requested."""
    if user.notify:
        # Simulate sending notification
        print(f"Notifying user {user.user_id}: {message}")
        return {"status": "sent", "user_id": user.user_id}
    return {"status": "skipped", "user_id": user.user_id}

@mcp.tool()
def get_stock_price(ticker: str) -> float:
    """Gets the current price for a stock ticker."""
    # Replace with actual API call
    prices = {"AAPL": 180.50, "GOOG": 140.20}
    return prices.get(ticker.upper(), 0.0)
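
To illustrate how a schema can be derived from type hints and docstrings, here is a minimal standard-library sketch. It is not FastMCP's implementation: the helper sketch_tool_schema, the _JSON_TYPES mapping, and the extra precision parameter are all hypothetical, and real schema generation covers many more cases (Pydantic models, optional types, nested structures).

```python
import inspect
from typing import get_type_hints

# Hypothetical mapping from Python types to JSON Schema type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def sketch_tool_schema(func):
    """Build a simplified, JSON-Schema-like description of a function."""
    hints = get_type_hints(func)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to a generic "object".
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "object")}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def get_stock_price(ticker: str, precision: int = 2) -> float:
    """Gets the current price for a stock ticker."""
    return 0.0


schema = sketch_tool_schema(get_stock_price)
```

In this sketch the docstring becomes the tool description, and only ticker is marked as required because precision has a default.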

Resources

Resources expose data to LLMs. They should primarily provide information without significant computation or side effects (like GET requests).

Decorate functions with @mcp.resource("your://uri"). Use curly braces {} in the URI to define dynamic resources (templates) where parts of the URI become function parameters.

# Static resource returning simple text
@mcp.resource("config://app-version")
def get_app_version() -> str:
    """Returns the application version."""
    return "v2.1.0"

# Dynamic resource template expecting a 'user_id' from the URI
@mcp.resource("db://users/{user_id}/email")
async def get_user_email(user_id: str) -> str:
    """Retrieves the email address for a given user ID."""
    # Replace with actual database lookup
    emails = {"123": "[email protected]", "456": "[email protected]"}
    return emails.get(user_id, "[email protected]")

# Resource returning JSON data
@mcp.resource("data://product-categories")
def get_categories() -> list[str]:
    """Returns a list of available product categories."""
    return ["Electronics", "Books", "Home Goods"]
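
As a rough illustration of how curly-brace placeholders in a URI can become function parameters, the sketch below matches a concrete URI against a template using only the standard library. The function match_uri_template is a hypothetical helper and makes no claim about how FastMCP resolves templates internally; it assumes each placeholder covers a single path segment.

```python
import re


def match_uri_template(template, uri):
    """Return the extracted parameters if `uri` matches `template`, else None."""
    # re.split with a capture group alternates literal text and placeholder names.
    parts = re.split(r"\{(\w+)\}", template)
    literals, names = parts[0::2], parts[1::2]
    # Each placeholder matches one path segment (anything except "/").
    pattern = "([^/]+)".join(re.escape(text) for text in literals)
    match = re.fullmatch(pattern, uri)
    if match is None:
        return None
    return dict(zip(names, match.groups()))


params = match_uri_template("db://users/{user_id}/email", "db://users/123/email")
```

A static URI such as config://app-version matches only itself and yields an empty parameter dictionary.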

Prompts

Prompts define reusable templates or interaction patterns for the LLM. They help guide the LLM on how to use your server's capabilities effectively.

Decorate functions with @mcp.prompt(). The function should return the desired prompt content, which can be a simple string, a Message object (like UserMessage or AssistantMessage), or a list of these.

from fastmcp.prompts.base import AssistantMessage, Message, UserMessage

@mcp.prompt()
def ask_review(code_snippet: str) -> str:
    """Generates a standard code review request."""
    return f"Please review the following code snippet for potential bugs and style issues:\n```python\n{code_snippet}\n```"

@mcp.prompt()
def debug_session_start(error_message: str) -> list[Message]:
    """Initiates a debugging help session."""
    return [
        UserMessage(f"I encountered an error:\n{error_message}"),
        AssistantMessage("Okay, I can help with that. Can you provide the full traceback and tell me what you were trying to do?")
    ]

Context

Gain access to MCP server capabilities within your tool or resource functions by adding a parameter type-hinted with fastmcp.Context.

from fastmcp import Context, FastMCP

mcp = FastMCP("Context Demo")

@mcp.resource("system://status")
async def get_system_status(ctx: Context) -> dict:
    """Checks system status and logs information."""
    await ctx.info("Checking system status...")
    # Perform checks
    await ctx.report_progress(1, 1) # Report completion
    return {"status": "OK", "load": 0.5, "client": ctx.client_id}

@mcp.tool()
async def process_large_file(file_uri: str, ctx: Context) -> str:
    """Processes a large file, reporting progress and reading resources."""
    await ctx.info(f"Starting processing for {file_uri}")
    # Read the resource using the context
    file_content_resource = await ctx.read_resource(file_uri)
    file_content = file_content_resource[0].content # Assuming single text content
    lines = file_content.splitlines()
    total_lines = len(lines)

    for i, line in enumerate(lines):
        # Process line...
        if (i + 1) % 100 == 0: # Report progress every 100 lines
            await ctx.report_progress(i + 1, total_lines)

    await ctx.info(f"Finished processing {file_uri}")
    return f"Processed {total_lines} lines."

The Context object provides:

  • Logging: ctx.debug(), ctx.info(), ctx.warning(), ctx.error()
  • Progress Reporting: ctx.report_progress(current, total)
  • Resource Access: await ctx.read_resource(uri)
  • Request Info: ctx.request_id, ctx.client_id
  • Sampling (Advanced): await ctx.sample(...) to ask the connected LLM client for completions.

Images

Easily handle image input and output using the fastmcp.Image helper class.

from fastmcp import FastMCP, Image
from PIL import Image as PILImage
import io

mcp = FastMCP("Image Demo")

@mcp.tool()
def create_thumbnail(image_data: Image) -> Image:
    """Creates a 100x100 thumbnail from the provided image."""
    img = PILImage.open(io.BytesIO(image_data.data)) # Assumes image_data received as Image with bytes
    img.thumbnail((100, 100))
    buffer = io.BytesIO()
    img.save(buffer, format="PNG")
    # Return a new Image object with the thumbnail data
    return Image(data=buffer.getvalue(), format="png")

@mcp.tool()
def load_image_from_disk(path: str) -> Image:
    """Loads an image from the specified path."""
    # Handles reading file and detecting format based on extension
    return Image(path=path)

FastMCP handles the conversion to/from the base64-encoded format required by the MCP protocol.
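
For readers curious what that conversion involves, here is a small sketch of base64 round-tripping with the standard library. The two helper functions are hypothetical; the dictionary layout (type, data, mimeType) follows the image content shape of the MCP specification as far as I know, so treat the field names as an assumption to check against the spec.

```python
import base64


def encode_image_content(data: bytes, image_format: str) -> dict:
    """Wrap raw image bytes in an MCP-style image content dictionary."""
    return {
        "type": "image",
        "data": base64.b64encode(data).decode("ascii"),
        "mimeType": f"image/{image_format}",
    }


def decode_image_content(content: dict) -> bytes:
    """Recover the raw bytes from an image content dictionary."""
    return base64.b64decode(content["data"])


png_signature = b"\x89PNG\r\n\x1a\n"  # the first eight bytes of any PNG file
content = encode_image_content(png_signature, "png")
```

Decoding the content again returns exactly the original bytes, which is the property the helper class relies on.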

MCP Clients

The Client class lets you interact with any MCP server (not just FastMCP ones) from Python code:

from fastmcp import Client

async with Client("path/to/server") as client:
    # Call a tool
    result = await client.call_tool("weather", {"location": "San Francisco"})
    print(result)
    
    # Read a resource
    res = await client.read_resource("db://users/123/profile")
    print(res)

You can connect to servers using any supported transport protocol (Stdio, SSE, FastMCP, etc.). If you don't specify a transport, the Client class automatically attempts to detect an appropriate one from your connection string or server object.

Client Methods

The Client class exposes several methods for interacting with MCP servers.

async with Client("path/to/server") as client:
    # List available tools
    tools = await client.list_tools()
    
    # List available resources
    resources = await client.list_resources()
    
    # Call a tool with arguments
    result = await client.call_tool("generate_report", {"user_id": 123})
    
    # Read a resource
    user_data = await client.read_resource("db://users/123/profile")
        
    # Get a prompt
    greeting = await client.get_prompt("welcome", {"name": "Alice"})
    
    # Send progress updates
    await client.progress("task-123", 50, 100)  # 50% complete
    
    # Basic connectivity testing
    await client.ping()

These methods correspond directly to MCP protocol operations, making it easy to interact with any MCP-compatible server (not just FastMCP ones).

Transport Options

FastMCP supports various transport protocols for connecting to MCP servers:

from fastmcp import Client
from fastmcp.client.transports import (
    SSETransport, 
    PythonStdioTransport, 
    FastMCPTransport
)

# Connect to a server over SSE (common for web-based MCP servers)
async with Client(SSETransport("http://localhost:8000/mcp")) as client:
    # Use client here...
    await client.ping()

# Connect to a Python script using stdio (useful for local tools)
async with Client(PythonStdioTransport("path/to/script.py")) as client:
    # Use client here...
    await client.ping()

# Connect directly to a FastMCP server object in the same process
from your_app import mcp_server
async with Client(FastMCPTransport(mcp_server)) as client:
    # Use client here...
    await client.ping()

Common transport options include:

  • SSETransport: Connect to a server via Server-Sent Events (HTTP)
  • PythonStdioTransport: Run a Python script and communicate via stdio
  • FastMCPTransport: Connect directly to a FastMCP server object
  • WSTransport: Connect via WebSockets

In addition, if you pass a connection string or FastMCP server object to the Client constructor, it will try to automatically detect the appropriate transport.
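
The idea of inferring a transport from a connection string can be sketched as follows. This is an illustrative heuristic only: the function guess_transport and its rules are assumptions, and FastMCP's actual detection logic may differ.

```python
from urllib.parse import urlparse


def guess_transport(target: str) -> str:
    """Guess a transport name from a connection string (illustrative only)."""
    scheme = urlparse(target).scheme
    if scheme in ("ws", "wss"):
        return "WSTransport"
    if scheme in ("http", "https"):
        return "SSETransport"
    if target.endswith(".py"):
        return "PythonStdioTransport"
    raise ValueError(f"Cannot infer a transport for {target!r}")
```

Passing an explicit transport object, as in the examples above, avoids any ambiguity when a heuristic like this would guess wrong.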

LLM Sampling

Sampling is an MCP feature that allows a server to request a completion from the client LLM, enabling sophisticated use cases while maintaining security and privacy on the server.

import marvin  # Or any other LLM client
from fastmcp import Client, Context, FastMCP
from fastmcp.client.sampling import RequestContext, SamplingMessage, SamplingParams

# -- SERVER SIDE --
# Create a server that requests LLM completions from the client

mcp = FastMCP("Sampling Example")

@mcp.tool()
async def generate_poem(topic: str, context: Context) -> str:
    """Generate a short poem about the given topic."""
    # The server requests a completion from the client LLM
    response = await context.sample(
        f"Write a short poem about {topic}",
        system_prompt="You are a talented poet who writes concise, evocative verses."
    )
    return response.text

@mcp.tool()
async def summarize_document(document_uri: str, context: Context) -> str:
    """Summarize a document using client-side LLM capabilities."""
    # First read the document as a resource
    doc_resource = await context.read_resource(document_uri)
    doc_content = doc_resource[0].content  # Assuming single text content
    
    # Then ask the client LLM to summarize it
    response = await context.sample(
        f"Summarize the following document:\n\n{doc_content}",
        system_prompt="You are an expert summarizer. Create a concise summary."
    )
    return response.text

# -- CLIENT SIDE --
# Create a client that handles the sampling requests

async def sampling_handler(
    messages: list[SamplingMessage],
    params: SamplingParams,
    ctx: RequestContext,
) -> str:
    """Handle sampling requests from the server using your preferred LLM."""
    # Extract the messages and system prompt
    prompt = [m.content.text for m in messages if m.content.type == "text"]
    system_instruction = params.systemPrompt or "You are a helpful assistant."
    
    # Use your preferred LLM client to generate completions
    return await marvin.say_async(
        message=prompt,
        instructions=system_instruction,
    )

# Connect them together
async with Client(mcp, sampling_handler=sampling_handler) as client:
    result = await client.call_tool("generate_poem", {"topic": "autumn leaves"})
    print(result.content[0].text)

This pattern is powerful because:

  1. The server can delegate text generation to the client LLM
  2. The server remains focused on business logic and data handling
  3. The client maintains control over which LLM is used and how requests are handled
  4. No sensitive data needs to be sent to external APIs

Roots Access

FastMCP exposes the MCP roots functionality, allowing clients to specify which file system roots they can access. This creates a secure boundary for tools that need to work with files. Note that the server must account for client roots explicitly.

from fastmcp import Client, RootsList

# Specify file roots that the client can access
roots = ["file:///path/to/allowed/directory"]

async with Client(mcp_server, roots=roots) as client:
    # Now tools in the MCP server can access files in the specified roots
    await client.call_tool("process_file", {"filename": "data.csv"})
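
Because the server must account for client roots explicitly, a tool that touches the file system needs its own check. Below is a minimal sketch of such a check using only the standard library. The helper is_within_roots is hypothetical, it assumes POSIX-style absolute paths, and it does not resolve symlinks, so a real implementation would need stricter handling.

```python
import posixpath
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse


def is_within_roots(path: str, roots: list[str]) -> bool:
    """Return True if `path` lies inside one of the file:// roots."""
    # Normalize first so that ".." segments cannot escape a root.
    candidate = PurePosixPath(posixpath.normpath(path))
    for root_uri in roots:
        parsed = urlparse(root_uri)
        if parsed.scheme != "file":
            continue  # only file:// roots are considered in this sketch
        root = PurePosixPath(posixpath.normpath(unquote(parsed.path)))
        if candidate == root or root in candidate.parents:
            return True
    return False
```

A tool such as the process_file example could call a check like this before opening any file and refuse paths that fall outside every root.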

Advanced Features

Building on the core concepts, FastMCP v2 introduces powerful features for more complex scenarios:

Proxy Servers

Create a FastMCP server that acts as an intermediary, proxying requests to another MCP endpoint (which could be a server or another client connection).

Use Cases:

  • Transport Conversion: Expose a server running on Stdio (like many local tools) over SSE or WebSockets, making it accessible to web clients or Claude Desktop.
  • Adding Functionality: Wrap an existing server to add authentication, request logging, or modified tool behavior.
  • Aggregating Servers: Combine multiple backend MCP servers behind a single proxy interface (though mount might be simpler for this).

import asyncio
from fastmcp import FastMCP, Client
from fastmcp.client.transports import PythonStdioTransport

# Create a client that connects to the original server
proxy_client = Client(
    transport=PythonStdioTransport('path/to/original_stdio_server.py'),
)

# Create a proxy server that connects to the client and exposes its capabilities
proxy = FastMCP.from_client(proxy_client, name="Stdio-to-SSE Proxy")

if __name__ == "__main__":
    proxy.run(transport='sse')

FastMCP.from_client is a class method that connects to the target, discovers its capabilities, and dynamically builds the proxy server instance.

Composing MCP Servers

Structure larger MCP applications by creating modular FastMCP servers and "mounting" them onto a parent server. This automatically handles prefixing for tool names and resource URIs, preventing conflicts.

from fastmcp import FastMCP

# --- Weather MCP ---
weather_mcp = FastMCP("Weather Service")

@weather_mcp.tool()
def get_forecast(city: str): 
    return f"Sunny in {city}"

@weather_mcp.resource("data://temp/{city}")
def get_temp(city: str): 
    return 25.0

# --- News MCP ---
news_mcp = FastMCP("News Service")

@news_mcp.tool()
def fetch_headlines():
    return ["Big news!", "Other news"]

@news_mcp.resource("data://latest_story")
def get_story():
    return "A story happened."

# --- Composite MCP ---

mcp = FastMCP("Composite")

# Mount sub-apps with prefixes
mcp.mount("weather", weather_mcp) # Tools prefixed "weather/", resources prefixed "weather+"
mcp.mount("news", news_mcp)       # Tools prefixed "news/", resources prefixed "news+"

@mcp.tool()
def ping(): 
    return "Composite OK"


if __name__ == "__main__":
    mcp.run()

This promotes code organization and reusability for complex MCP systems.
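
To make the prefixing concrete, the sketch below applies the naming scheme described in the comments above ("prefix/" for tools, "prefix+" for resources) to plain lists of names. The helper mount_names is hypothetical and the separators are taken from those comments; they may differ between FastMCP versions.

```python
def mount_names(prefix, tools, resources):
    """Return the prefixed tool names and resource URIs for one sub-server."""
    return (
        [f"{prefix}/{name}" for name in tools],
        [f"{prefix}+{uri}" for uri in resources],
    )


weather_tools, weather_resources = mount_names(
    "weather", ["get_forecast"], ["data://temp/{city}"]
)
news_tools, news_resources = mount_names(
    "news", ["fetch_headlines"], ["data://latest_story"]
)
```

Both sub-servers use data:// URIs, yet the prefixed URIs stay distinct, which is how mounting prevents conflicts.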

OpenAPI & FastAPI Generation

Leverage your existing web APIs by automatically generating FastMCP servers from them.

By default, the following rules are applied:

  • GET requests -> MCP resources
  • GET requests with path parameters -> MCP resource templates
  • All other HTTP methods -> MCP tools

You can override these rules to customize or even ignore certain endpoints.
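
The default rules listed above can be summarized in a few lines. This is only a sketch of the mapping as described here, with a hypothetical function name, and not the library's route-mapping code.

```python
def classify_route(method: str, path: str) -> str:
    """Apply the default mapping rules to one HTTP route."""
    if method.upper() == "GET":
        # A "{" in the path signals a path parameter such as /items/{item_id}.
        return "resource_template" if "{" in path else "resource"
    return "tool"
```

For example, GET /status would map to a resource, GET /items/{item_id} to a resource template, and POST /items to a tool.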

From FastAPI:

from fastapi import FastAPI
from fastmcp import FastMCP

# Your existing FastAPI application
fastapi_app = FastAPI(title="My Existing API")

@fastapi_app.get("/status")
def get_status(): 
    return {"status": "running"}

@fastapi_app.post("/items")
def create_item(name: str, price: float): 
    return {"id": 1, "name": name, "price": price}

# Generate an MCP server directly from the FastAPI app
mcp_server = FastMCP.from_fastapi(fastapi_app)

if __name__ == "__main__":
    mcp_server.run()

From an OpenAPI Specification:

import httpx
import json
from fastmcp import FastMCP

# Load the OpenAPI spec (dict)
# with open("my_api_spec.json", "r") as f:
#     openapi_spec = json.load(f)
openapi_spec = { ... } # Your spec dict

# Create an HTTP client to make requests to the actual API endpoint
http_client = httpx.AsyncClient(base_url="https://api.yourservice.com")

# Generate the MCP server
mcp_server = FastMCP.from_openapi(openapi_spec, client=http_client)

if __name__ == "__main__":
    mcp_server.run()

Handling stderr

The MCP spec allows for the server to write anything it wants to stderr, and it doesn't specify the format in any way. FastMCP will forward the server's stderr to the client's stderr.

Running Your Server

Choose the method that best suits your needs:

Development Mode (Recommended for Building & Testing)

Use fastmcp dev for an interactive testing environment with the MCP Inspector.

fastmcp dev your_server_file.py
# With temporary dependencies
fastmcp dev your_server_file.py --with pandas --with numpy
# With local package in editable mode
fastmcp dev your_server_file.py --with-editable .

Claude Desktop Integration (For Regular Use)

Use fastmcp install to set up your server for persistent use within the Claude Desktop app. It handles creating an isolated environment using uv.

fastmcp install your_server_file.py
# With a custom name in Claude
fastmcp install your_server_file.py --name "My Analysis Tool"
# With extra packages and environment variables
fastmcp install server.py --with requests -v API_KEY=123 -f .env

Direct Execution (For Advanced Use Cases)

Run your server script directly for custom deployments or integrations outside of Claude. You manage the environment and dependencies yourself.

Add to your your_server_file.py:

if __name__ == "__main__":
    mcp.run() # Assuming 'mcp' is your FastMCP instance

Run with:

python your_server_file.py
# or
uv run python your_server_file.py

Server Object Names

If your FastMCP instance is not named mcp, server, or app, specify it using file:object syntax for the dev and install commands:

fastmcp dev my_module.py:my_mcp_instance
fastmcp install api.py:api_app
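
A file:object argument like the ones above can be split as in the following sketch. The function parse_server_spec is a hypothetical helper, not the CLI's parser. It returns None for the object name when none is given, in which case the default names mcp, server, and app would be tried.

```python
def parse_server_spec(spec: str):
    """Split 'file.py:object' into (file_path, object_name or None)."""
    file_part, sep, object_name = spec.rpartition(":")
    # Require a valid identifier after the colon so that Windows drive
    # letters such as "C:\\server.py" are not mistaken for an object name.
    if sep and file_part and object_name.isidentifier():
        return file_part, object_name
    return spec, None
```

With this approach "my_module.py:my_mcp_instance" yields the file and the instance name, while a bare "server.py" is returned unchanged.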

Examples

Explore the examples/ directory for code samples demonstrating various features:

  • simple_echo.py: Basic tool, resource, and prompt.
  • complex_inputs.py: Using Pydantic models for tool inputs.
  • mount_example.py: Mounting multiple FastMCP servers.
  • sampling.py: Using LLM completions within your MCP server.
  • screenshot.py: Tool returning an Image object.
  • text_me.py: Tool interacting with an external API.
  • memory.py: More complex example with database interaction.

Contributing

Contributions make the open-source community vibrant! We welcome improvements and features.

Open Developer Guide

Prerequisites

  • Python 3.10+
  • uv

Setup

  1. Clone: git clone https://github.com/jlowin/fastmcp.git && cd fastmcp
  2. Install Env & Dependencies: uv venv && uv sync (Activate the .venv after creation)

Testing

Run the test suite:

uv run pytest -vv

Formatting & Linting

We use ruff via pre-commit.

  1. Install hooks: pre-commit install
  2. Run checks: pre-commit run --all-files

Pull Requests

  1. Fork the repository.
  2. Create a feature branch.
  3. Make changes, commit, and push to your fork.
  4. Open a pull request against the main branch of jlowin/fastmcp.

Please open an issue or discussion for questions or suggestions!

CVE cache of the official CVE List in CVE JSON 5 format


Note 2025-04-02 CVE Services Maintenance April 2 beginning at approximately 1:00 PM (until approximately 5:00 PM EDT): The CVE Program will be performing maintenance on CVE Services on April 2 between 1 and 5 PM EDT. This maintenance action will have no impact on existing CVE Records in the CVE List. Although the CVE Repository will be available during this time, it will not be updated with any newly published records until the maintenance is complete.

Note 2024-02-13 CVE Repository Maintenance February 13 beginning at approximately 9:30AM EST: The CVE Program will be performing maintenance on the CVE List repository to correct inaccuracies in the DATE fields of approximately 3,500 CVE Records (i.e., in the date published, date reserved, date updated, and date modified fields). Only DATE fields will be updated during this maintenance. The repository will remain operational during this time.

Note 2024-12-04 CVE REST Services was updated to use the CVE Record Format Schema 5.1.1 on Wednesday 12/4 at 4:00 PM EST: This update introduces “non-breaking” changes containing new features that some CNAs may be interested in using in the future (see the CVE Record Format version 5.1.1 release notes). As a fully backwards compatible update (meaning that all previously published CVE Records will validate using this schema), most users will see no operational impact as a result of this change. This schema defines the data format for CVE Records, regardless of whether they were published before or after December 4.

Note 2024-09-17 CVE Repository Historical Record Correction: CVE Records originally published prior to 2023 with incorrect Reserved/Published/Update dates have been corrected. This action corrected approximately 27,000 records that had been assigned incorrect Reserved, Published or Updated dates as part of JSON 5.0 CVE Record adoption.

Note 2024-07-31 CVE Records may now contain a new container called the CVE Program Container: This new container provides additional information added by the CVE Program to include Program-added references. Users of this repository may need to process two containers. See below for more information.

Note 2024-05-08 5:30pm: CVE REST Services was updated to the CVE Record Format Schema 5.1 on 2024-05-08 at 5:30pm EDT. With this update, a CVE Record in this repository may now be either a 5.0 or a 5.1 formatted record. The format is reflected in the "dataVersion" field. Users of this repository who "validate" CVE records are advised to validate records by using the appropriate version of the schema (i.e., 5.0 or 5.1) as reflected in this field. Users should not determine which schema to use based on the deployment date of the new format (i.e., 2024-05-08 at 5:30pm EDT), as there are inconsistencies in published/updated date values.
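
As a rough sketch of that advice, a validator can pick the schema from the record itself rather than from any date. The example assumes the JSON key is spelled dataVersion, and the schema file paths are hypothetical placeholders for wherever you keep local copies of each schema.

```python
import json

# Hypothetical local paths to downloaded copies of each schema version.
SCHEMA_PATHS = {
    "5.0": "schemas/CVE_Record_Format_5.0.json",
    "5.1": "schemas/CVE_Record_Format_5.1.json",
}


def schema_path_for(record: dict) -> str:
    """Choose the schema file from the record's own version field."""
    version = record.get("dataVersion")
    try:
        return SCHEMA_PATHS[version]
    except KeyError:
        raise ValueError(f"Unsupported dataVersion: {version!r}") from None


record = json.loads('{"dataType": "CVE_RECORD", "dataVersion": "5.1"}')
chosen = schema_path_for(record)
```

Records with a missing or unknown version raise an error instead of being validated against a guessed schema.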

CVE List V5

This repository is the official CVE List. It is a catalog of all CVE Records identified by, or reported to, the CVE Program.

This repository hosts downloadable files of CVE Records in the CVE Record Format (view the schema). They are updated regularly (about every 7 minutes) using the official CVE Services API. You may search, download, and use the content hosted in this repository, per the CVE Program Terms of Use.

Legacy Format Downloads No Longer Supported—All support for the legacy CVE content download formats (i.e., CSV, HTML, XML, and CVRF) ended on June 30, 2024. These legacy download formats, which will no longer be updated and were phased out over the first six months of 2024, have been replaced by this repository as the only supported method for CVE Record downloads. Learn more here.

CVE Record Containers

CVE Records may now consist of multiple containers:

  • A CNA container
  • The CVE Program Container
  • Optional multiple ADP-specific containers

CVE Program Container

All CVE Program-added references after 7/31/2024 for a CVE Record will be stored in the CVE Program Container of that Record. CNA-provided references will continue to be stored in the CNA Container.

The CVE Program Container is implemented in an ADP container format in the CVE Record.

Specific JSON/CVE Record fields that will be in the CVE Program Container are as follows:

  • adp:title field: "CVE Program Container"
  • adp:providerMetadata:shortName field: "CVE"
  • adp:references field as described here

References in the CVE Program Container maintain the same format as references in a CNA Container.

The CVE Program Container may contain references that have the x_transferred tag. References with this tag were read from the CNA container on 7/31/2024. This is a "one time" copy to maintain the "state" of the CNA reference list as of 7/31/2024. CVE Program-added references after this date will not have the "x_transferred" tag.

In the case of new CVE Records created after 7/31/2024, if no Program-provided enriched data is added, there will be no CVE Program Container associated with the CVE Record.

Implementation Considerations:

Required Containers Processing: After 7/31/2024, to retrieve all information about a reported vulnerability in the CVE Repository, tool vendors and community users will need to examine the CVE Record CNA Container and the CVE Program Container (if one exists). These two containers are minimally required to obtain the core information required by the Program. All other ADP containers remain optional from a Program perspective.

Potential for Duplicate References: The possibility of reference duplications is an artifact of having more than one organization providing references in separate locations. Downstream users will have to determine the appropriate way to resolve potential reference duplications between the CNA container and the CVE Program Container.
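
One possible way to process both containers and handle duplicates is sketched below. It assumes the usual record layout (containers.cna plus a containers.adp list) and identifies the CVE Program Container by its title; deduplicating by URL is just one policy chosen for illustration, and the sample record is invented.

```python
def collect_references(record: dict) -> list:
    """Merge CNA and CVE Program Container references, dropping repeated URLs."""
    containers = record.get("containers", {})
    references = list(containers.get("cna", {}).get("references", []))
    for adp in containers.get("adp", []):
        if adp.get("title") == "CVE Program Container":
            references.extend(adp.get("references", []))
    seen = set()
    unique = []
    for ref in references:
        url = ref.get("url")
        if url not in seen:  # keep the first occurrence (the CNA's entry)
            seen.add(url)
            unique.append(ref)
    return unique


# Invented sample record for illustration only.
sample = {
    "containers": {
        "cna": {"references": [{"url": "https://example.com/advisory"}]},
        "adp": [
            {
                "title": "CVE Program Container",
                "references": [
                    {"url": "https://example.com/advisory", "tags": ["x_transferred"]},
                    {"url": "https://example.com/patch"},
                ],
            }
        ],
    }
}
merged = collect_references(sample)
```

Other ADP containers are ignored here, matching the statement that they remain optional from a Program perspective.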

CISA-ADP Container

The CISA-ADP Container was launched on June 4 to provide value-added information for CVE Records going forward, and retroactively to February 2024.

The CISA ADP is providing three components to enrich CVE Records:

  1. Stakeholder-Specific Vulnerability Categorization (SSVC)
  2. Known Exploited Vulnerabilities (KEV) catalog data
  3. "Vulnrichment" updates (e.g., missing CVSS, CWE, CPE information for CVE Records that meet specific threat characteristics, and for when CNAs do not provide it themselves)

Reference the CISA ADP Process or the CISA Vulnrichment github site for a full description of what information is provided and the format in which it is recorded.

How to Download the CVE List

There are 2 main ways to download CVE Records from this repository:

  1. Using git clients: this is the fastest way to keep the CVE List up to date, using tools most developers are familiar with. For more information, see the git section, below.
  2. Using the Releases zip files. For more information, see the Releases section, below.

git

Using the git command line tool or any git UI client is the easiest way to stay up to date with the CVE List. To get started, clone this repository: git clone git@github.com:CVEProject/cvelistV5.git. Once cloned, run git pull whenever you need the latest updates, just as with any other GitHub repository.

Releases

This repository includes release versions of all current CVE Records generated from the official CVE Services API. All times are listed in Coordinated Universal Time (UTC). Each release contains a description of CVEs added or updated since the last release, and an Assets section containing the downloads. Note that the zip files are quite large and so will take some time to download.

  • Baseline downloads are issued at the end of each day at midnight and posted under Assets in the following file name format: Year-Month-Day_all_CVEs_at_midnight.zip (e.g., 2024-04-04_all_CVEs_at_midnight.zip). This file remains unchanged for 24 hours. If you are updating your CVE List using zip files daily (or less frequently), this is the best one to use.
  • Hourly updates are also provided under Assets using the file name format: Year-Month-Day_delta_CVEs_at_Hour00Z.zip (e.g., 2024-04-04_delta_CVEs_at_0100Z.zip). This is useful if you need your CVE List to be accurate hourly. Be aware that this file only contains the deltas since the baseline zip file.
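
The two naming patterns can be generated programmatically, for instance when scripting downloads. The helper names below are hypothetical; the formats are taken from the examples above, with times in UTC.

```python
from datetime import date


def baseline_name(day: date) -> str:
    """File name of the daily baseline zip for the given UTC date."""
    return f"{day.isoformat()}_all_CVEs_at_midnight.zip"


def delta_name(day: date, hour: int) -> str:
    """File name of the hourly delta zip for the given UTC date and hour."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    return f"{day.isoformat()}_delta_CVEs_at_{hour:02d}00Z.zip"
```

Since each delta only contains changes since the baseline, a full update for a given hour needs the day's baseline file plus the matching delta file.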

Known Issues with the cvelistV5 repository

The CVE Program is currently aware of the following issues with regard to CVE List downloads. These issues are currently being addressed by the CVE Automation Working Group (AWG). Updates or resolutions will be noted here when available.

  1. Updated 9/17/2024: Some CVE Records published prior to 2023 had incorrect publication, reserved, and update dates. As of 9/17/2024 this has been corrected.

  2. Added 9/17/2024: Publication and update date discrepancies exist for CVE Records published by the MITRE CNA-LR between May 8, 2024 and June 7, 2024 (affecting approximately 515 records).
    Users of this repository for CVE metrics (and other publication/update data sensitive analysis) should be aware of this issue. A fix will be forthcoming.

Reporting Issues

Please use one of the following:

Pull Requests Not Allowed

This repository contains CVE Records published by CVE Program partners. It does not accept pull requests.

Cloning this Repository

You may clone the repository using git clone. However, pull requests will not be accepted.

Help

Please use the CVE Request Web Forms and select “Other” from the dropdown.

Featuring powerful AI capabilities and supporting various e-book formats, it makes reading smarter and more focused.


English | 简体中文 | Türkçe


Anx Reader

Anx Reader is a thoughtfully crafted e-book reader for book lovers. Featuring powerful AI capabilities and supporting various e-book formats, it makes reading smarter and more focused. With its modern interface design, we're committed to delivering pure reading pleasure.

Cross-Platform: iOS/macOS/Windows/Android

Full Sync: Reading Progress/Book Files/Highlighted Notes

Multiple AI: OpenAI/Claude/Gemini/DeepSeek

Insightful Data: Detailed and Intuitive Charts - Weekly/Monthly/Yearly/Heatmap

Notes: Multiple Ways to Filter/Sort - Export as TXT, Markdown, CSV

Highly Customizable Styles: Line Spacing/Paragraph Spacing/Margins/Fonts/Color Schemes/Page Turning Methods

Practical Features: Translation/Search/Quick Ask AI/Write Ideas/Text-to-Speech

OS | Source
iOS | App Store
macOS | Mac App Store, GitHub
Windows | GitHub
Android | GitHub

📚 Rich Format Support

  • Support for major e-book formats: EPUB, MOBI, AZW3, FB2, TXT
  • Perfect parsing for optimal reading experience

☁️ Seamless Sync

  • Cross-device synchronization of reading progress, notes, and books via WebDAV
  • Continue your reading journey anywhere, anytime

🤖 Smart AI Assistant

  • Integration with leading AI services: OpenAI, DeepSeek, Claude, Gemini
  • Intelligent content summarization and reading position recall for enhanced efficiency

🎨 Personalized Reading Experience

  • Carefully designed theme colors with customization options
  • Switch freely between scrolling and pagination modes
  • Import custom fonts to create your personal reading space

📊 Professional Reading Analytics

  • Comprehensive reading statistics
  • Weekly, monthly, and yearly reading reports
  • Intuitive reading heatmap to track every moment of your reading journey

📝 Powerful Note System

  • Flexible text annotation features
  • Export options in TXT, CSV, and Markdown formats
  • Easily organize and share your reading insights

🛠️ Practical Tools

  • Smart TTS reading to rest your eyes
  • Full-text search for quick content location
  • Instant word translation to enhance reading efficiency

💻 Cross-Platform Support

  • Seamless experience on Android / Windows / macOS / iOS
  • Consistent user interface across devices

TODO

  • UI adaptation for tablets
  • Page-turning animation
  • TTS voice reading
  • Reading fonts
  • Translation
  • Full-text translation
  • Support for more file types (pdf)
  • Support for WebDAV synchronization
  • Support for Linux

I Encountered a Problem, What Should I Do?

Check Troubleshooting

Submit an issue, and we will respond as soon as possible.

Telegram Group: https://t.me/AnxReader

QQ Group: 1042905699

Screenshots

Donations

If you like Anx Reader, please consider supporting the project by donating. Your donation will help me maintain and improve the project.

❤️ Donate

Building

Want to build Anx Reader from source? Please follow these steps:

  • Install Flutter.
  • Clone and enter the project directory.
  • Run flutter pub get.
  • Run flutter gen-l10n to generate multi-language files.
  • Run dart run build_runner build --delete-conflicting-outputs to generate the Riverpod code.
  • Run flutter run to launch the application.

You may encounter Flutter version incompatibility issues. Please refer to the Flutter documentation.

Code signing policy

Sponsors

Free code signing on Windows provided by SignPath.io, certificate by SignPath Foundation.

License

This project is licensed under the MIT License.

Starting from version 1.1.4, the open source license for the Anx Reader project has been changed from the MIT License to the GNU General Public License version 3 (GPLv3).

After version 1.2.6, the selection and highlight feature has been rewritten, and the open source license has been changed from the GPL-3.0 License to the MIT License. All contributors agree to this change(#116).

Thanks

foliate-js, which is MIT licensed, is used as the ebook renderer. Thanks to the author for providing such a great project.

foliate, which is GPL-3.0 licensed, inspired the selection and highlight feature. Since 1.2.6, that feature has been rewritten.

And many other open source projects, thanks to all the authors for their contributions.


[Support 0.48.x](Reset Cursor AI MachineID & Bypass Higher Token Limit) Cursor Ai ,自动重置机器ID , 免费升级使用Pro功能: You've reached your trial request limit. / Too many free trial accounts used on this machine. Please upgrade to pro. We have this limit in place to prevent abuse. Please let us know if you believe this is a mistake.


➤ Cursor Free VIP

Cursor Pro Logo

Release License: CC BY-NC-ND 4.0 Stars Downloads Buy Me a Coffee

yeongpin%2Fcursor-free-vip | Trendshift
Buy Me a Coffee

Support Latest 0.48.x Version | 支持最新 0.48.x 版本

This tool is for educational purposes, currently the repo does not violate any laws. Please support the original project. This tool will not generate any fake email accounts and OAuth access.

Supports Windows, macOS and Linux.

For optimal performance, run with privileges and always stay up to date.

這是一款用於學習和研究的工具,目前 repo 沒有違反任何法律。請支持原作者。 這款工具不會生成任何假的電子郵件帳戶和 OAuth 訪問。

支持 Windows、macOS 和 Linux。

對於最佳性能,請以管理員身份運行並始終保持最新。

new

🔄 Change Log | 更新日志

Watch Change Log | 查看更新日志

✨ Features | 功能特點

  • Support Windows macOS and Linux systems
    支持 Windows、macOS 和 Linux 系統

  • Reset Cursor's configuration
    重置 Cursor 的配置

  • Multi-language support (English, 简体中文, 繁體中文, Vietnamese)
    多語言支持(英文、简体中文、繁體中文、越南語)

💻 System Support | 系統支持

Operating System Architecture Supported
Windows x64, x86
macOS Intel, Apple Silicon
Linux x64, x86, ARM64

👀 How to use | 如何使用

⭐ Auto Run Script | 腳本自動化運行

Linux/macOS

curl -fsSL https://raw.githubusercontent.com/yeongpin/cursor-free-vip/main/scripts/install.sh -o install.sh && chmod +x install.sh && ./install.sh

Archlinux

Install via AUR

yay -S cursor-free-vip-git

Windows

irm https://raw.githubusercontent.com/yeongpin/cursor-free-vip/main/scripts/install.ps1 | iex
⭐ Manual Reset Machine | 手動運行重置機器

Linux/macOS

curl -fsSL https://raw.githubusercontent.com/yeongpin/cursor-free-vip/main/scripts/reset.sh | sudo bash

Windows

irm https://raw.githubusercontent.com/yeongpin/cursor-free-vip/main/scripts/reset.ps1 | iex

If you want to stop the script, please press Ctrl+C
要停止腳本,請按 Ctrl+C

❗ Note | 注意事項

📝 Config | 文件配置 Win / Macos / Linux Path | 路徑 [Documents/.cursor-free-vip/config.ini]

⭐ Config | 文件配置
[Chrome]
# Default Google Chrome Path | 默認Google Chrome 遊覽器路徑
chromepath = C:\Program Files\Google/Chrome/Application/chrome.exe

[Turnstile]
# Handle Turnstile Wait Time | 等待人機驗證時間
handle_turnstile_time = 2
# Handle Turnstile Wait Random Time (must merge 1-3 or 1,3) | 等待人機驗證隨機時間(必須是 1-3 或者 1,3 這樣的組合)
handle_turnstile_random_time = 1-3

[OSPaths]
# Storage Path | 存儲路徑
storage_path = /Users/username/Library/Application Support/Cursor/User/globalStorage/storage.json
# SQLite Path | SQLite路徑
sqlite_path = /Users/username/Library/Application Support/Cursor/User/globalStorage/state.vscdb
# Machine ID Path | 機器ID路徑
machine_id_path = /Users/username/Library/Application Support/Cursor/machineId
# For Linux users: ~/.config/cursor/machineid

[Timing]
# Min Random Time | 最小隨機時間
min_random_time = 0.1
# Max Random Time | 最大隨機時間
max_random_time = 0.8
# Page Load Wait | 頁面加載等待時間
page_load_wait = 0.1-0.8
# Input Wait | 輸入等待時間
input_wait = 0.3-0.8
# Submit Wait | 提交等待時間
submit_wait = 0.5-1.5
# Verification Code Input | 驗證碼輸入等待時間
verification_code_input = 0.1-0.3
# Verification Success Wait | 驗證成功等待時間
verification_success_wait = 2-3
# Verification Retry Wait | 驗證重試等待時間
verification_retry_wait = 2-3
# Email Check Initial Wait | 郵件檢查初始等待時間
email_check_initial_wait = 4-6
# Email Refresh Wait | 郵件刷新等待時間
email_refresh_wait = 2-4
# Settings Page Load Wait | 設置頁面加載等待時間
settings_page_load_wait = 1-2
# Failed Retry Time | 失敗重試時間
failed_retry_time = 0.5-1
# Retry Interval | 重試間隔
retry_interval = 8-12
# Max Timeout | 最大超時時間
max_timeout = 160

[Utils]
# Check Update | 檢查更新
check_update = True
# Show Account Info | 顯示賬號信息
show_account_info = True

[WindowsPaths]
storage_path = C:\Users\yeongpin\AppData\Roaming\Cursor\User\globalStorage\storage.json
sqlite_path = C:\Users\yeongpin\AppData\Roaming\Cursor\User\globalStorage\state.vscdb
machine_id_path = C:\Users\yeongpin\AppData\Roaming\Cursor\machineId
cursor_path = C:\Users\yeongpin\AppData\Local\Programs\Cursor\resources\app
updater_path = C:\Users\yeongpin\AppData\Local\cursor-updater
update_yml_path = C:\Users\yeongpin\AppData\Local\Programs\Cursor\resources\app-update.yml
product_json_path = C:\Users\yeongpin\AppData\Local\Programs\Cursor\resources\app\product.json

[Browser]
default_browser = opera
chrome_path = C:\Program Files\Google\Chrome\Application\chrome.exe
edge_path = C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe
firefox_path = C:\Program Files\Mozilla Firefox\firefox.exe
brave_path = C:\Program Files\BraveSoftware/Brave-Browser/Application/brave.exe
chrome_driver_path = D:\VisualCode\cursor-free-vip-new\drivers\chromedriver.exe
edge_driver_path = D:\VisualCode\cursor-free-vip-new\drivers\msedgedriver.exe
firefox_driver_path = D:\VisualCode\cursor-free-vip-new\drivers\geckodriver.exe
brave_driver_path = D:\VisualCode\cursor-free-vip-new\drivers\chromedriver.exe
opera_path = C:\Users\yeongpin\AppData\Local\Programs\Opera\opera.exe
opera_driver_path = D:\VisualCode\cursor-free-vip-new\drivers\chromedriver.exe

[OAuth]
show_selection_alert = False
timeout = 120
max_attempts = 3
  • Use administrator privileges to run the script
    請使用管理員身份運行腳本

  • Confirm that Cursor is closed before running the script
    請確保在運行腳本前已經關閉 Cursor

  • This tool is only for learning and research purposes
    此工具僅供學習和研究使用

  • Please comply with the relevant software usage terms when using this tool
    使用本工具時請遵守相關軟件使用條款

🚨 Common Issues | 常見問題

如果遇到權限問題,請確保: 此腳本以管理員身份運行
If you encounter permission issues, please ensure: This script is run with administrator privileges
Error 'User is not authorized' This means your account was banned for using temporary (disposal) mail. Ensure using a non-temporary mail service

🤩 Contribution | 貢獻

Issues and Pull Requests are welcome!



📩 Disclaimer | 免責聲明

本工具僅供學習和研究使用,使用本工具所產生的任何後果由使用者自行承擔。

This tool is only for learning and research purposes, and any consequences arising from the use of this tool are borne by the user.

💰 Buy Me a Coffee | 請我喝杯咖啡

buy_me_a_coffee
buy_me_a_coffee

⭐ Star History | 星星數

Star History Chart

📝 License | 授權

This project is licensed under CC BY-NC-ND 4.0. Please refer to the LICENSE file for details.


Toolkit for linearizing PDFs for LLM datasets/training



olmOCR


A toolkit for training language models to work with PDF documents in the wild.

Try the online demo: https://olmocr.allenai.org/

What is included:

  • A prompting strategy to get really good natural text parsing using ChatGPT 4o - buildsilver.py
  • A side-by-side eval toolkit for comparing different pipeline versions - runeval.py
  • Basic filtering by language and SEO spam removal - filter.py
  • Finetuning code for Qwen2-VL and Molmo-O - train.py
  • Processing millions of PDFs through a finetuned model using Sglang - pipeline.py
  • Viewing Dolma docs created from PDFs - dolmaviewer.py

Installation

Requirements:

  • Recent NVIDIA GPU (tested on RTX 4090, L40S, A100, H100) with at least 20 GB of GPU RAM
  • 30GB of free disk space

You will need to install poppler-utils and additional fonts for rendering PDF images.

Install dependencies (Ubuntu/Debian)

sudo apt-get update
sudo apt-get install poppler-utils ttf-mscorefonts-installer msttcorefonts fonts-crosextra-caladea fonts-crosextra-carlito gsfonts lcdf-typetools

Set up a conda environment and install olmocr

conda create -n olmocr python=3.11
conda activate olmocr

git clone https://github.com/allenai/olmocr.git
cd olmocr

# For CPU-only operations, ex. running benchmarks
pip install -e .

# For actually converting the files with your own GPU
pip install -e ".[gpu]" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/

Local Usage Example

For quick testing, try the web demo. To run locally, a GPU is required, as inference is powered by sglang under the hood.

Convert a Single PDF:

python -m olmocr.pipeline ./localworkspace --pdfs tests/gnarly_pdfs/horribleocr.pdf

Convert an Image file:

python -m olmocr.pipeline ./localworkspace --pdfs random_page.png

Convert Multiple PDFs:

python -m olmocr.pipeline ./localworkspace --pdfs tests/gnarly_pdfs/*.pdf

Results will be stored as JSON in ./localworkspace.

Viewing Results

Extracted text is stored as Dolma-style JSONL inside of the ./localworkspace/results directory.

cat localworkspace/results/output_*.jsonl
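
The same files can be read from Python. This is a minimal sketch that assumes each line is a JSON object with "id" and "text" keys, as is common for Dolma-style documents. The field names and the helper name iter_texts are assumptions, so check them against your actual output.

```python
import json
from typing import Iterable, Iterator


def iter_texts(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (id, text) pairs from Dolma-style JSONL lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        doc = json.loads(line)
        # "id" and "text" are assumed field names; adjust to the real output.
        yield doc.get("id", ""), doc.get("text", "")


if __name__ == "__main__":
    import glob

    for path in glob.glob("localworkspace/results/output_*.jsonl"):
        with open(path, encoding="utf-8") as fh:
            for doc_id, text in iter_texts(fh):
                print(doc_id, len(text))
```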

View results side-by-side with the original PDFs (uses dolmaviewer command):

python -m olmocr.viewer.dolmaviewer localworkspace/results/output_*.jsonl

Now open ./dolma_previews/tests_gnarly_pdfs_horribleocr_pdf.html in your favorite browser.


Multi-node / Cluster Usage

To convert millions of PDFs using multiple nodes running in parallel, olmOCR supports reading your PDFs from AWS S3 and coordinating work through an AWS S3 output bucket.

For example, you can start this command on your first worker node, and it will set up a simple work queue in your AWS bucket and start converting PDFs.

python -m olmocr.pipeline s3://my_s3_bucket/pdfworkspaces/exampleworkspace --pdfs s3://my_s3_bucket/jakep/gnarly_pdfs/*.pdf

Now on any subsequent nodes, just run this and they will start grabbing items from the same workspace queue.

python -m olmocr.pipeline s3://my_s3_bucket/pdfworkspaces/exampleworkspace

If you are at Ai2 and want to linearize millions of PDFs efficiently using beaker, just add the --beaker flag. This will prepare the workspace on your local machine, and then launch N GPU workers in the cluster to start converting PDFs.

For example:

python -m olmocr.pipeline s3://my_s3_bucket/pdfworkspaces/exampleworkspace --pdfs s3://my_s3_bucket/jakep/gnarly_pdfs/*.pdf --beaker --beaker_gpus 4

Full documentation for the pipeline

python -m olmocr.pipeline --help
usage: pipeline.py [-h] [--pdfs PDFS] [--workspace_profile WORKSPACE_PROFILE] [--pdf_profile PDF_PROFILE] [--pages_per_group PAGES_PER_GROUP]
                   [--max_page_retries MAX_PAGE_RETRIES] [--max_page_error_rate MAX_PAGE_ERROR_RATE] [--workers WORKERS] [--apply_filter] [--stats] [--model MODEL]
                   [--model_max_context MODEL_MAX_CONTEXT] [--model_chat_template MODEL_CHAT_TEMPLATE] [--target_longest_image_dim TARGET_LONGEST_IMAGE_DIM]
                   [--target_anchor_text_len TARGET_ANCHOR_TEXT_LEN] [--beaker] [--beaker_workspace BEAKER_WORKSPACE] [--beaker_cluster BEAKER_CLUSTER]
                   [--beaker_gpus BEAKER_GPUS] [--beaker_priority BEAKER_PRIORITY]
                   workspace

Manager for running millions of PDFs through a batch inference pipeline

positional arguments:
  workspace             The filesystem path where work will be stored, can be a local folder, or an s3 path if coordinating work with many workers, s3://bucket/prefix/

options:
  -h, --help            show this help message and exit
  --pdfs PDFS           Path to add pdfs stored in s3 to the workspace, can be a glob path s3://bucket/prefix/*.pdf or path to file containing list of pdf paths
  --workspace_profile WORKSPACE_PROFILE
                        S3 configuration profile for accessing the workspace
  --pdf_profile PDF_PROFILE
                        S3 configuration profile for accessing the raw pdf documents
  --pages_per_group PAGES_PER_GROUP
                        Aiming for this many pdf pages per work item group
  --max_page_retries MAX_PAGE_RETRIES
                        Max number of times we will retry rendering a page
  --max_page_error_rate MAX_PAGE_ERROR_RATE
                        Rate of allowable failed pages in a document, 1/250 by default
  --workers WORKERS     Number of workers to run at a time
  --apply_filter        Apply basic filtering to English pdfs which are not forms, and not likely seo spam
  --stats               Instead of running any job, reports some statistics about the current workspace
  --model MODEL         List of paths where you can find the model to convert this pdf. You can specify several different paths here, and the script will try to use the
                        one which is fastest to access
  --model_max_context MODEL_MAX_CONTEXT
                        Maximum context length that the model was fine tuned under
  --model_chat_template MODEL_CHAT_TEMPLATE
                        Chat template to pass to sglang server
  --target_longest_image_dim TARGET_LONGEST_IMAGE_DIM
                        Dimension on longest side to use for rendering the pdf pages
  --target_anchor_text_len TARGET_ANCHOR_TEXT_LEN
                        Maximum amount of anchor text to use (characters)
  --beaker              Submit this job to beaker instead of running locally
  --beaker_workspace BEAKER_WORKSPACE
                        Beaker workspace to submit to
  --beaker_cluster BEAKER_CLUSTER
                        Beaker clusters you want to run on
  --beaker_gpus BEAKER_GPUS
                        Number of gpu replicas to run
  --beaker_priority BEAKER_PRIORITY
                        Beaker priority level for the job
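
To make the --max_page_error_rate default concrete: at 1/250, a 500-page document tolerates at most two failed pages. The check below sketches that arithmetic only. The function name and the choice of <= at the boundary are assumptions, not taken from the pipeline source.

```python
from fractions import Fraction


def within_error_budget(failed_pages: int, total_pages: int,
                        max_rate: Fraction = Fraction(1, 250)) -> bool:
    """True if the share of failed pages does not exceed max_rate."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    # Fractions avoid floating-point surprises right at the threshold.
    return Fraction(failed_pages, total_pages) <= max_rate
```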

Team

olmOCR is developed and maintained by the AllenNLP team, backed by the Allen Institute for Artificial Intelligence (AI2). AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering. To learn more about who specifically contributed to this codebase, see our contributors page.

License

olmOCR is licensed under Apache 2.0. A full copy of the license can be found on GitHub.

Citing

@misc{olmocr,
      title={{olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models}},
      author={Jake Poznanski and Jon Borchardt and Jason Dunkelberger and Regan Huff and Daniel Lin and Aman Rangapur and Christopher Wilhelm and Kyle Lo and Luca Soldaini},
      year={2025},
      eprint={2502.18443},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.18443},
}

NocoBase is an extensibility-first, open-source no-code/low-code platform for building business applications and enterprise solutions.


English | 中文 | 日本語

https://github.com/user-attachments/assets/a50c100a-4561-4e06-b2d2-d48098659ec0


What is NocoBase

NocoBase is an extensibility-first, open-source no-code development platform.
Instead of investing years of time and millions of dollars in research and development, deploy NocoBase in a few minutes and you'll have a private, controllable, and extremely scalable no-code development platform!

Homepage:
https://www.nocobase.com/

Online Demo:
https://demo.nocobase.com/new

Documents:
https://docs.nocobase.com/

Forum:
https://forum.nocobase.com/

Tutorials:
https://www.nocobase.com/en/tutorials

Use Cases:
https://www.nocobase.com/en/blog/tags/customer-stories

Release Notes

Our blog is regularly updated with release notes and provides a weekly summary.

Distinctive features

1. Data model-driven

Most form-, table-, or process-driven no-code products, such as Airtable, create data structures directly in the user interface: adding a new column to a table adds a new field. This is simple to use, but it limits functionality and flexibility in more complex scenarios.

NocoBase separates the data structure from the user interface, allowing you to create any number of blocks (data views) for each data collection, with different types, styles, content, and actions in each block. This balances the simplicity of no-code operation with the flexibility of native development.
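
The separation can be pictured with a small sketch: one collection holds the data, and several blocks present different views of it. This is a conceptual illustration only. The class names Collection and Block and their fields are hypothetical and are not NocoBase's API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Collection:
    """A data collection: field names plus rows, independent of any UI."""
    name: str
    fields: list[str]
    rows: list[dict] = field(default_factory=list)


@dataclass
class Block:
    """A view over a collection: which fields to show and which rows to keep."""
    collection: Collection
    show: list[str]
    keep: Callable[[dict], bool] = lambda row: True

    def render(self) -> list[dict]:
        return [{f: row[f] for f in self.show}
                for row in self.collection.rows if self.keep(row)]


orders = Collection("orders", ["id", "customer", "total"],
                    [{"id": 1, "customer": "Ann", "total": 40},
                     {"id": 2, "customer": "Bob", "total": 900}])
# Two blocks over the same data: a full table and a filtered summary.
table = Block(orders, show=["id", "customer", "total"])
big_orders = Block(orders, show=["customer", "total"],
                   keep=lambda r: r["total"] > 100)
```

Adding a third block would not touch the collection, which is the point of keeping the two layers apart.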


2. What you see is what you get

NocoBase enables the development of complex and distinctive business systems, but this does not mean that complex and specialized operations are required. With a single click, configuration options are displayed on the usage interface, and administrators with system configuration privileges can directly configure the user interface in a WYSIWYG manner.


3. Everything is implemented as plugins

NocoBase adopts a plugin architecture: all new functions can be realized by developing and installing plugins, and expanding functionality is as easy as installing an app on your phone.


Installation

NocoBase supports three installation methods:

  • Installing With Docker (👍Recommended)

    Suitable for no-code scenarios, with no code to write. When upgrading, just download the latest image and reboot.

  • Installing from create-nocobase-app CLI

    The business code of the project is completely independent and supports low-code development.

  • Installing from Git source code

    Choose this method if you want to try the latest unreleased version, or if you want to contribute and need to change and debug the source code. It requires a high level of development skill. When the code is updated, you can git pull the latest changes.


Docmost is an open-source collaborative wiki and documentation software. It is an open-source alternative to Confluence and Notion.


Docmost

Open-source collaborative wiki and documentation software.
Website | Documentation | Twitter / X


Getting started

To get started with Docmost, please refer to our documentation or try our cloud version.

Features

  • Real-time collaboration
  • Diagrams (Draw.io, Excalidraw and Mermaid)
  • Spaces
  • Permissions management
  • Groups
  • Comments
  • Page history
  • Search
  • File attachments
  • Embeds (Airtable, Loom, Miro and more)
  • Translations (10+ languages)

Screenshots


License

Docmost core is licensed under the open-source AGPL 3.0 license.
Enterprise features are available under an enterprise license (Enterprise Edition).

All files in the following directories are licensed under the Docmost Enterprise license defined in packages/ee/License.

  • apps/server/src/ee
  • apps/client/src/ee
  • packages/ee

Contributing

See the development documentation

Thanks

Special thanks to:


Crowdin for providing access to their localization platform.


Algolia for providing full-text search to the docs.


🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.



Vanna

Vanna is an MIT-licensed open-source Python RAG (Retrieval-Augmented Generation) framework for SQL generation and related functionality.

https://github.com/vanna-ai/vanna/assets/7146154/1901f47a-515d-4982-af50-f12761a3b2ce


How Vanna works


Vanna works in two easy steps - train a RAG "model" on your data, and then ask questions which will return SQL queries that can be set up to automatically run on your database.

  1. Train a RAG "model" on your data.
  2. Ask questions.

If you don't know what RAG is, don't worry -- you don't need to know how this works under the hood to use it. You just need to know that you "train" a model, which stores some metadata, and then use it to "ask" questions.

See the base class for more details on how this works under the hood.
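
For readers who do want a rough picture, the train/ask flow can be sketched with a toy class. This is not Vanna's implementation: the class name ToyRag is hypothetical, and plain word overlap stands in for the embedding search a real vector store would perform. The prompt it builds is what would be sent to an LLM.

```python
class ToyRag:
    """Toy illustration of the train/ask flow; not Vanna's implementation."""

    def __init__(self) -> None:
        self.items: list[str] = []  # stored "training" snippets (DDL, docs, SQL)

    def train(self, snippet: str) -> None:
        # "Training" here just stores metadata; no model weights change.
        self.items.append(snippet)

    def retrieve(self, question: str, k: int = 2) -> list[str]:
        # Rank snippets by how many lowercase words they share with the question.
        words = set(question.lower().split())
        scored = sorted(self.items,
                        key=lambda s: len(words & set(s.lower().split())),
                        reverse=True)
        return scored[:k]

    def build_prompt(self, question: str) -> str:
        # The retrieved snippets become context for the LLM prompt.
        context = "\n".join(self.retrieve(question))
        return f"Context:\n{context}\n\nQuestion: {question}\nSQL:"
```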

User Interfaces

These are some of the user interfaces that we've built using Vanna. You can use these as-is or as a starting point for your own custom interface.

Supported LLMs

Supported VectorStores

Supported Databases

Getting started

See the documentation for specifics on your desired database, LLM, etc.

If you want to get a feel for how it works after training, you can try this Colab notebook.

Install

pip install vanna

There are a number of optional packages that can be installed; see the documentation for more details.

Import

See the documentation if you're customizing the LLM or vector database.

# The import statement will vary depending on your LLM and vector database. This is an example for OpenAI + ChromaDB

from vanna.openai.openai_chat import OpenAI_Chat
from vanna.chromadb.chromadb_vector import ChromaDB_VectorStore

class MyVanna(ChromaDB_VectorStore, OpenAI_Chat):
    def __init__(self, config=None):
        ChromaDB_VectorStore.__init__(self, config=config)
        OpenAI_Chat.__init__(self, config=config)

vn = MyVanna(config={'api_key': 'sk-...', 'model': 'gpt-4-...'})

# See the documentation for other options

Training

You may or may not need to run these vn.train commands depending on your use case. See the documentation for more details.

These statements are shown to give you a feel for how it works.

Train with DDL Statements

DDL statements contain information about the table names, columns, data types, and relationships in your database.

vn.train(ddl="""
    CREATE TABLE IF NOT EXISTS my_table (
        id INT PRIMARY KEY,
        name VARCHAR(100),
        age INT
    )
""")

Train with Documentation

Sometimes you may want to add documentation about your business terminology or definitions.

vn.train(documentation="Our business defines XYZ as ...")

Train with SQL

You can also add SQL queries to your training data. This is useful if you have some queries already lying around. You can just copy and paste those from your editor to begin generating new SQL.

vn.train(sql="SELECT name, age FROM my_table WHERE name = 'John Doe'")

Asking questions

vn.ask("What are the top 10 customers by sales?")

You'll get SQL:

SELECT c.c_name as customer_name,
        sum(l.l_extendedprice * (1 - l.l_discount)) as total_sales
FROM   snowflake_sample_data.tpch_sf1.lineitem l join snowflake_sample_data.tpch_sf1.orders o
        ON l.l_orderkey = o.o_orderkey join snowflake_sample_data.tpch_sf1.customer c
        ON o.o_custkey = c.c_custkey
GROUP BY customer_name
ORDER BY total_sales desc limit 10;

If you've connected to a database, you'll get the table:

        CUSTOMER_NAME   TOTAL_SALES
0  Customer#000143500  6757566.0218
1  Customer#000095257  6294115.3340
2  Customer#000087115  6184649.5176
3  Customer#000131113  6080943.8305
4  Customer#000134380  6075141.9635
5  Customer#000103834  6059770.3232
6  Customer#000069682  6057779.0348
7  Customer#000102022  6039653.6335
8  Customer#000098587  6027021.5855
9  Customer#000064660  5905659.6159

You'll also get an automated Plotly chart:

RAG vs. Fine-Tuning

RAG

  • Portable across LLMs
  • Easy to remove training data if any of it becomes obsolete
  • Much cheaper to run than fine-tuning
  • More future-proof -- if a better LLM comes out, you can just swap it out

Fine-Tuning

  • Good if you need to minimize tokens in the prompt
  • Slow to get started
  • Expensive to train and run (generally)

Why Vanna?

  1. High accuracy on complex datasets.
    • Vanna’s capabilities are tied to the training data you give it
    • More training data means better accuracy for large and complex datasets
  2. Secure and private.
    • Your database contents are never sent to the LLM or the vector database
    • SQL execution happens in your local environment
  3. Self learning.
    • If using via Jupyter, you can choose to "auto-train" it on the queries that were successfully executed
    • If using via other interfaces, you can have the interface prompt the user to provide feedback on the results
    • Correct question to SQL pairs are stored for future reference and make the future results more accurate
  4. Supports any SQL database.
    • The package allows you to connect to any SQL database that you can otherwise connect to with Python
  5. Choose your front end.
    • Most people start in a Jupyter Notebook.
    • Expose to your end users via Slackbot, web app, Streamlit app, or a custom front end.

Extending Vanna

Vanna is designed to connect to any database, LLM, and vector database. There's a VannaBase abstract base class that defines some basic functionality. The package provides implementations for use with OpenAI and ChromaDB. You can easily extend Vanna to use your own LLM or vector database. See the documentation for more details.
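
The extension pattern can be sketched with stand-in classes: an abstract base declares what a vector store and an LLM must provide, and a concrete class combines one implementation of each through multiple inheritance. All names below (BaseSketch, find_context, complete, ListStore, EchoLlm) are hypothetical and do not reflect VannaBase's real method names; see the documentation for those.

```python
from abc import ABC, abstractmethod


class BaseSketch(ABC):
    """Stand-in for an abstract base class; method names are hypothetical."""

    @abstractmethod
    def find_context(self, question: str) -> list[str]: ...

    @abstractmethod
    def complete(self, prompt: str) -> str: ...

    def ask(self, question: str) -> str:
        # Shared logic lives in the base: gather context, then call the LLM.
        prompt = "\n".join(self.find_context(question) + [question])
        return self.complete(prompt)


class ListStore(BaseSketch):
    """Vector-store side: returns every stored snippet (no similarity search)."""

    def __init__(self, snippets: list[str]) -> None:
        self.snippets = snippets

    def find_context(self, question: str) -> list[str]:
        return list(self.snippets)


class EchoLlm(BaseSketch):
    """LLM side: echoes the last prompt line instead of calling a model."""

    def complete(self, prompt: str) -> str:
        return "-- SQL for: " + prompt.splitlines()[-1]


class MySketch(ListStore, EchoLlm):
    """Each parent fills in one half of the abstract interface."""
```

Swapping in a different store or LLM then means replacing one parent class while the shared ask logic stays untouched.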

Vanna in 100 Seconds

https://github.com/vanna-ai/vanna/assets/7146154/eb90ee1e-aa05-4740-891a-4fc10e611cab

More resources


AI personal assistant for email. Open source app to help you reach inbox zero fast.


Inbox Zero - Your AI Email Assistant

Open source email app to reach inbox zero fast.
Website · Discord · Issues

About

There are two parts to Inbox Zero:

  1. An AI email assistant that helps you spend less time on email.
  2. Open source AI email client.

If you're looking to contribute to the project, the email client is the best place to do this.

Deploy with Vercel

Thanks to Vercel for sponsoring Inbox Zero in support of open-source software.

Features

  • AI Personal Assistant: Manages your email for you based on a plain text prompt file. It can take any action a human assistant can take on your behalf (Draft reply, Label, Archive, Reply, Forward, Mark Spam, and even call a webhook).
  • Reply Zero: Track emails that need your reply and those awaiting responses.
  • Smart Categories: Categorize everyone that's ever emailed you.
  • Bulk Unsubscriber: Quickly unsubscribe from emails you never read in one click.
  • Cold Email Blocker: Automatically block cold emails.
  • Email Analytics: Track your email activity with daily, weekly, and monthly stats.

Learn more in our docs.

Feature Screenshots

AI Assistant, Reply Zero, Gmail Client, Bulk Unsubscriber

Demo Video

Inbox Zero demo

Built with


Feature Requests

To request a feature open a GitHub issue, or join our Discord.

Getting Started for Developers

We offer a hosted version of Inbox Zero at https://getinboxzero.com. To self-host follow the steps below.

Contributing to the project

You can view open tasks in our GitHub Issues. Join our Discord to discuss tasks and check what's being worked on.

ARCHITECTURE.md explains the architecture of the project (LLM generated).

Requirements

Setup

Here's a video on how to set up the project. It covers the same steps mentioned in this document but goes into greater detail on setting up the external services.

The external services that are required are:

You also need to set an LLM, but you can use a local one too:

  • Anthropic
  • OpenAI
  • AWS Bedrock Anthropic
  • Google Gemini
  • OpenRouter (any model)
  • Groq (Llama 3.3 70B)
  • Ollama (local)

Create your own .env file:

cp apps/web/.env.example apps/web/.env
cd apps/web
pnpm install

For self-hosting, you may also need to copy the .env file to both the root directory AND the apps/web directory (PRs welcome to improve this):

cp apps/web/.env .env

Set the environment variables in the newly created .env. You can see a list of required variables in: apps/web/env.ts.

The required environment variables:

  • NEXTAUTH_SECRET -- can be any random string (try using openssl rand -hex 32 for a quick secure random string)
  • GOOGLE_CLIENT_ID -- Google OAuth client ID. More info here
  • GOOGLE_CLIENT_SECRET -- Google OAuth client secret. More info here
  • GOOGLE_ENCRYPT_SECRET -- Secret key for encrypting OAuth tokens (try using openssl rand -hex 32 for a secure key)
  • GOOGLE_ENCRYPT_SALT -- Salt for encrypting OAuth tokens (try using openssl rand -hex 16 for a secure salt)
  • UPSTASH_REDIS_URL -- Redis URL from Upstash. (can be empty if you are using Docker Compose)
  • UPSTASH_REDIS_TOKEN -- Redis token from Upstash. (or specify your own random string if you are using Docker Compose)
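
If openssl is not available, the random values above can also be produced with Python's standard library. This is a minimal sketch: the helper name hex_secret is an assumption, and secrets.token_hex(n) returns n random bytes as 2n hex characters, matching the length of openssl rand -hex n.

```python
import secrets


def hex_secret(num_bytes: int) -> str:
    """Return num_bytes of cryptographically strong randomness as hex text."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Paste the printed lines into apps/web/.env.
    print(f"NEXTAUTH_SECRET={hex_secret(32)}")
    print(f"GOOGLE_ENCRYPT_SECRET={hex_secret(32)}")
    print(f"GOOGLE_ENCRYPT_SALT={hex_secret(16)}")
```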

We use Postgres for the database. For Redis, you can use Upstash Redis or set up your own Redis instance.

You can run Postgres & Redis locally using docker-compose:

docker-compose up -d # -d will run the services in the background

When using Vercel with Fluid Compute turned off, you should set MAX_DURATION=300 or lower. See Vercel limits for different plans here.

To run the migrations:

pnpm prisma migrate dev

To run the app locally for development (slower):

pnpm run dev

Or from the project root:

turbo dev

To build and run the app locally in production mode (faster):

pnpm run build
pnpm start

Open http://localhost:3000 to view the app in your browser.

To upgrade yourself, make yourself an admin in the .env: [email protected] Then upgrade yourself at: http://localhost:3000/admin.

Supported LLMs

For the LLM, you can use Anthropic, OpenAI, or Anthropic on AWS Bedrock. You can also use Ollama by setting the following environment variables:

DEFAULT_LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434/api
NEXT_PUBLIC_OLLAMA_MODEL=gemma3  # or whatever available model you want to use

Note: If you need to access a locally hosted Ollama instance while the application runs in Docker, you can use http://host.docker.internal:11434/api as the base URL. You might also need to set OLLAMA_HOST to 0.0.0.0 in the Ollama configuration file.
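
The two base URLs mentioned above differ only in the host name. A small sketch of how the three Ollama variables could be assembled for either setup follows; the function name ollama_env and its app_in_docker flag are hypothetical and exist only for this illustration.

```python
def ollama_env(model: str, app_in_docker: bool = False) -> dict[str, str]:
    """Build the Ollama-related variables described above.

    `app_in_docker` is a hypothetical flag: set it when the app runs in
    Docker while Ollama runs on the host machine.
    """
    host = "host.docker.internal" if app_in_docker else "localhost"
    return {
        "DEFAULT_LLM_PROVIDER": "ollama",
        "OLLAMA_BASE_URL": f"http://{host}:11434/api",
        "NEXT_PUBLIC_OLLAMA_MODEL": model,
    }


if __name__ == "__main__":
    for name, value in ollama_env("gemma3", app_in_docker=True).items():
        print(f"{name}={value}")
```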

You can select the model you wish to use on the app's /settings page.

Setting up Google OAuth and Gmail API

  1. Create a Project in Google Cloud Console

  2. Enable Required APIs

  3. Configure the OAuth Consent Screen

    • Go to 'APIs & Services' > 'OAuth consent screen'
    • Choose 'External' user type (or 'Internal' if you have Google Workspace)
    • Fill in required app information
    • Add the following scopes:
      https://www.googleapis.com/auth/userinfo.profile
      https://www.googleapis.com/auth/userinfo.email
      https://www.googleapis.com/auth/gmail.modify
      https://www.googleapis.com/auth/gmail.settings.basic
      https://www.googleapis.com/auth/contacts
      
    • Add yourself as a test user under 'Test users' section
  4. Create OAuth 2.0 Credentials

    • Go to 'APIs & Services' > 'Credentials'
    • Click 'Create Credentials' > 'OAuth client ID'
    • Select 'Web application' type
    • Add authorized JavaScript origins:
      • Development: http://localhost:3000
      • Production: https://your-production-url.com
    • Add authorized redirect URIs:
      • Development:
        http://localhost:3000/api/auth/callback/google
        
      • Production:
        https://your-production-url.com/api/auth/callback/google
        
  5. Configure Environment Variables

    • Add to your .env file:
      GOOGLE_CLIENT_ID=your_client_id
      GOOGLE_CLIENT_SECRET=your_client_secret
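
The authorized origin and the redirect URI in step 4 are both derived from the URL the app is served on. The sketch below shows that relationship so the two values stay consistent; oauth_urls is a hypothetical helper, and only the callback path is taken from the steps above.

```python
from urllib.parse import urlsplit

# Callback path as listed in the redirect URIs above.
CALLBACK_PATH = "/api/auth/callback/google"


def oauth_urls(base_url: str) -> dict[str, str]:
    """Derive the JavaScript origin and redirect URI for a deployment URL."""
    parts = urlsplit(base_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {base_url!r}")
    # An origin is scheme + host (+ port), with no path or trailing slash.
    origin = f"{parts.scheme}://{parts.netloc}"
    return {"origin": origin, "redirect_uri": origin + CALLBACK_PATH}


if __name__ == "__main__":
    print(oauth_urls("http://localhost:3000"))
```

The values entered in the Google Cloud Console have to match what the app sends, so a stray trailing slash or a different port is a likely cause of redirect errors.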
      

Set up push notifications via Google PubSub to handle emails in real time

Follow instructions here.

  1. Create a topic
  2. Create a subscription
  3. Grant publish rights on your topic

Set the GOOGLE_PUBSUB_TOPIC_NAME environment variable. When creating the subscription, select Push. The URL should look something like https://www.getinboxzero.com/api/google/webhook?token=TOKEN or https://abc.ngrok-free.app/api/google/webhook?token=TOKEN, where the domain is your own. Set GOOGLE_PUBSUB_VERIFICATION_TOKEN in your .env file to the value of TOKEN.
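
To illustrate how the token in the push URL relates to GOOGLE_PUBSUB_VERIFICATION_TOKEN, here is a minimal sketch. It is not the app's actual webhook handler: both function names are hypothetical, and the constant-time comparison is an assumption about what a careful check would look like.

```python
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

# Webhook path as shown in the example URLs above.
WEBHOOK_PATH = "/api/google/webhook"


def build_webhook_url(base_url: str, token: str) -> str:
    """Build the push endpoint URL to enter in the PubSub subscription."""
    return f"{base_url.rstrip('/')}{WEBHOOK_PATH}?{urlencode({'token': token})}"


def token_matches(request_url: str, expected_token: str) -> bool:
    """Check the `token` query parameter of an incoming request URL.

    Illustrative only: compares against the value that would be stored in
    GOOGLE_PUBSUB_VERIFICATION_TOKEN, using a constant-time comparison.
    """
    values = parse_qs(urlsplit(request_url).query).get("token", [])
    if len(values) != 1:
        return False
    return hmac.compare_digest(values[0].encode(), expected_token.encode())


if __name__ == "__main__":
    url = build_webhook_url("https://abc.ngrok-free.app", "TOKEN")
    print(url, token_matches(url, "TOKEN"))
```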

For development, ngrok can be helpful:

ngrok http 3000
# or with an ngrok domain to keep your endpoint stable (set `XYZ`):
ngrok http --domain=XYZ.ngrok-free.app 3000

And then update the webhook endpoint in the Google PubSub subscriptions dashboard.

To start watching emails visit: /api/google/watch/all

Watching for email updates

Set up cron jobs to call the following endpoints. The Google watch is necessary; the others are optional.

  "crons": [
    {
      "path": "/api/google/watch/all",
      "schedule": "0 1 * * *"
    },
    {
      "path": "/api/resend/summary/all",
      "schedule": "0 16 * * 1"
    },
    {
      "path": "/api/reply-tracker/disable-unused-auto-draft",
      "schedule": "0 3 * * *"
    }
  ]

Here are some easy ways to run cron jobs. Upstash is a free, easy option. I could never get the Vercel vercel.json cron configuration to work. Open to PRs if you find a fix for that.
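
The schedules in the cron list above use the standard five-field syntax (minute, hour, day of month, month, day of week). As a rough aid for reading them, the sketch below checks whether a given time matches an expression. It is a deliberately simplified matcher written for this note: it only understands plain numbers and `*`, which is all the three schedules use, and it says nothing about which time zone your scheduler applies.

```python
from datetime import datetime


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` matches a five-field cron expression.

    Simplified on purpose: each field must be `*` or a single number.
    Ranges, lists, and step values are not supported.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {expr!r}")
    actual = (
        when.minute,
        when.hour,
        when.day,
        when.month,
        (when.weekday() + 1) % 7,  # cron: 0 = Sunday; Python: 0 = Monday
    )
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


if __name__ == "__main__":
    monday_4pm = datetime(2024, 1, 1, 16, 0)  # 1 January 2024 was a Monday
    print(cron_matches("0 16 * * 1", monday_4pm))
```

Read this way, "0 1 * * *" fires daily at 01:00, "0 16 * * 1" fires on Mondays at 16:00, and "0 3 * * *" fires daily at 03:00.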


Python tool for converting files and office documents to Markdown.


MarkItDown

Built by AutoGen Team

Tip: MarkItDown now offers an MCP (Model Context Protocol) server for integration with LLM applications like Claude Desktop. See markitdown-mcp for more information.

Important: Breaking changes from 0.0.1 to 0.1.0:

  • Dependencies are now organized into optional feature-groups (further details below). Use pip install 'markitdown[all]' to have backward-compatible behavior.
  • convert_stream() now requires a binary file-like object (e.g., a file opened in binary mode, or an io.BytesIO object). This is a breaking change from the previous version, which also accepted text file-like objects such as io.StringIO.
  • The DocumentConverter class interface has changed to read from file-like streams rather than file paths. No temporary files are created anymore. If you are the maintainer of a plugin, or custom DocumentConverter, you likely need to update your code. Otherwise, if only using the MarkItDown class or CLI (as in these examples), you should not need to change anything.
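
For code that previously passed a text stream such as io.StringIO, one possible migration is to encode the text into an io.BytesIO first. The sketch below uses only the standard library; to_binary_stream is a hypothetical helper, and UTF-8 is an assumed default encoding.

```python
import io


def to_binary_stream(text_stream: io.TextIOBase, encoding: str = "utf-8") -> io.BytesIO:
    """Read all remaining text from `text_stream` and wrap it as bytes.

    The returned io.BytesIO is a binary file-like object positioned at
    the start, which is the kind of input convert_stream() now requires.
    """
    return io.BytesIO(text_stream.read().encode(encoding))


if __name__ == "__main__":
    binary = to_binary_stream(io.StringIO("<h1>Hello</h1>"))
    print(binary.read())
    # With markitdown installed, `binary` (rewound with binary.seek(0))
    # could then be passed to MarkItDown().convert_stream(...).
```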

MarkItDown is a lightweight Python utility for converting various files to Markdown for use with LLMs and related text analysis pipelines. To this end, it is most comparable to textract, but with a focus on preserving important document structure and content as Markdown (including headings, lists, tables, links, etc.). While the output is often reasonably presentable and human-friendly, it is meant to be consumed by text analysis tools -- and may not be the best option for high-fidelity document conversions for human consumption.

At present, MarkItDown supports:

  • PDF
  • PowerPoint
  • Word
  • Excel
  • Images (EXIF metadata and OCR)
  • Audio (EXIF metadata and speech transcription)
  • HTML
  • Text-based formats (CSV, JSON, XML)
  • ZIP files (iterates over contents)
  • YouTube URLs
  • EPubs
  • ... and more!

Why Markdown?

Markdown is extremely close to plain text, with minimal markup or formatting, but still provides a way to represent important document structure. Mainstream LLMs, such as OpenAI's GPT-4o, natively "speak" Markdown, and often incorporate Markdown into their responses unprompted. This suggests that they have been trained on vast amounts of Markdown-formatted text, and understand it well. As a side benefit, Markdown conventions are also highly token-efficient.

Installation

To install MarkItDown, use pip: pip install 'markitdown[all]'. Alternatively, you can install it from the source:

git clone git@github.com:microsoft/markitdown.git
cd markitdown
pip install -e 'packages/markitdown[all]'

Usage

Command-Line

markitdown path-to-file.pdf > document.md

Or use -o to specify the output file:

markitdown path-to-file.pdf -o document.md

You can also pipe content:

cat path-to-file.pdf | markitdown

Optional Dependencies

MarkItDown has optional dependencies for activating various file formats. Earlier in this document, we installed all optional dependencies with the [all] option. However, you can also install them individually for more control. For example:

pip install 'markitdown[pdf, docx, pptx]'

will install only the dependencies for PDF, DOCX, and PPTX files.

At the moment, the following optional dependencies are available:

  • [all] Installs all optional dependencies
  • [pptx] Installs dependencies for PowerPoint files
  • [docx] Installs dependencies for Word files
  • [xlsx] Installs dependencies for Excel files
  • [xls] Installs dependencies for older Excel files
  • [pdf] Installs dependencies for PDF files
  • [outlook] Installs dependencies for Outlook messages
  • [az-doc-intel] Installs dependencies for Azure Document Intelligence
  • [audio-transcription] Installs dependencies for audio transcription of wav and mp3 files
  • [youtube-transcription] Installs dependencies for fetching YouTube video transcription
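
When the set of formats is chosen in a script (for example, when building a container image), the requirement string can be assembled from the extras listed above. This is only a sketch: pip_requirement is a hypothetical helper, and KNOWN_EXTRAS simply mirrors the list in this section, so it may lag behind newer releases.

```python
# Extras copied from the list above; newer releases may add more.
KNOWN_EXTRAS = {
    "all", "pptx", "docx", "xlsx", "xls", "pdf", "outlook",
    "az-doc-intel", "audio-transcription", "youtube-transcription",
}


def pip_requirement(extras: list[str]) -> str:
    """Build a requirement string such as 'markitdown[pdf,docx]'."""
    unknown = sorted(set(extras) - KNOWN_EXTRAS)
    if unknown:
        raise ValueError(f"unknown extras: {', '.join(unknown)}")
    if not extras:
        return "markitdown"
    return f"markitdown[{','.join(extras)}]"


if __name__ == "__main__":
    # Quote the result when passing it to pip in a shell.
    print(f"pip install '{pip_requirement(['pdf', 'docx', 'pptx'])}'")
```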

Plugins

MarkItDown also supports 3rd-party plugins. Plugins are disabled by default. To list installed plugins:

markitdown --list-plugins

To enable plugins use:

markitdown --use-plugins path-to-file.pdf

To find available plugins, search GitHub for the hashtag #markitdown-plugin. To develop a plugin, see packages/markitdown-sample-plugin.

Azure Document Intelligence

To use Microsoft Document Intelligence for conversion:

markitdown path-to-file.pdf -o document.md -d -e "<document_intelligence_endpoint>"

More information about how to set up an Azure Document Intelligence Resource can be found here

Python API

Basic usage in Python:

from markitdown import MarkItDown

md = MarkItDown(enable_plugins=False) # Set to True to enable plugins
result = md.convert("test.xlsx")
print(result.text_content)

Document Intelligence conversion in Python:

from markitdown import MarkItDown

md = MarkItDown(docintel_endpoint="<document_intelligence_endpoint>")
result = md.convert("test.pdf")
print(result.text_content)

To use Large Language Models for image descriptions, provide llm_client and llm_model:

from markitdown import MarkItDown
from openai import OpenAI

client = OpenAI()
md = MarkItDown(llm_client=client, llm_model="gpt-4o")
result = md.convert("example.jpg")
print(result.text_content)
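
The examples above convert one file at a time. For a whole directory, a thin wrapper can walk the tree and write one Markdown file per input. The sketch below is an illustration rather than part of MarkItDown: convert_tree is a hypothetical helper, and it takes the conversion function as a parameter so the traversal logic runs without the library installed.

```python
from pathlib import Path
from typing import Callable


def convert_tree(src: Path, dst: Path, to_markdown: Callable[[Path], str]) -> list[Path]:
    """Convert every file under `src`, mirroring the layout under `dst`.

    `to_markdown` receives a file path and returns Markdown text.
    Outputs keep the original name plus `.md` (report.pdf -> report.pdf.md)
    so that files differing only by extension do not overwrite each other.
    """
    written = []
    # Collect inputs first so files written below are never re-read.
    for path in sorted(p for p in src.rglob("*") if p.is_file()):
        target = dst / path.relative_to(src)
        target = target.with_name(target.name + ".md")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(to_markdown(path), encoding="utf-8")
        written.append(target)
    return written


# With markitdown installed, usage might look like this:
#   from markitdown import MarkItDown
#   md = MarkItDown()
#   convert_tree(Path("docs"), Path("out"), lambda p: md.convert(str(p)).text_content)
```

Reusing a single MarkItDown instance across files, as in the commented usage, avoids constructing it once per file.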

Docker

docker build -t markitdown:latest .
docker run --rm -i markitdown:latest < ~/your-file.pdf > output.md

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

How to Contribute

You can help by looking at issues or helping review PRs. Any issue or PR is welcome, but we have also marked some as 'open for contribution' and 'open for reviewing' to help facilitate community contributions. These are of course just suggestions, and you are welcome to contribute in any way you like.

          All           Especially Needs Help from Community
Issues    All Issues    Issues open for contribution
PRs       All PRs       PRs open for reviewing

Running Tests and Checks

  • Navigate to the MarkItDown package:

    cd packages/markitdown
    
  • Install hatch in your environment and run tests:

    pip install hatch  # Other ways of installing hatch: https://hatch.pypa.io/dev/install/
    hatch shell
    hatch test
    

    (Alternative) Use the Devcontainer which has all the dependencies installed:

    # Reopen the project in Devcontainer and run:
    hatch test
    
  • Run pre-commit checks before submitting a PR: pre-commit run --all-files

Contributing 3rd-party Plugins

You can also contribute by creating and sharing 3rd party plugins. See packages/markitdown-sample-plugin for more details.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.