GitHub Trending - Weekly
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
📌 Introducing Dify Workflow File Upload: Recreate Google NotebookLM Podcast
Dify Cloud · Self-hosting · Documentation · Enterprise inquiry
Dify is an open-source LLM app development platform. Its intuitive interface combines agentic AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
Quick start
Before installing Dify, make sure your machine meets the following minimum system requirements:
- CPU >= 2 Core
- RAM >= 4 GiB
The easiest way to start the Dify server is through docker compose. Before running Dify with the following commands, make sure that Docker and Docker Compose are installed on your machine:
cd dify
cd docker
cp .env.example .env
docker compose up -d
After running, you can access the Dify dashboard in your browser at http://localhost/install and start the initialization process.
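If you script your setup, a small readiness check saves visiting the page before the containers are ready. The snippet below is a minimal sketch (not part of Dify's docs) that simply polls the URL above, assuming the default localhost deployment:

```python
# Minimal readiness check for a local Dify deployment (a sketch, not part of
# Dify itself; assumes the default http://localhost address used above).
import time
import webbrowser

import requests

URL = "http://localhost/install"

for _ in range(30):                       # wait up to ~60 seconds
    try:
        if requests.get(URL, timeout=2).status_code == 200:
            break
    except requests.RequestException:
        pass
    time.sleep(2)

webbrowser.open(URL)                      # open the setup page once it answers
```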
Seeking help
Please refer to our FAQ if you encounter problems setting up Dify. Reach out to the community and us if you are still having issues.
If you'd like to contribute to Dify or do additional development, refer to our guide to deploying from source code.
Key features
1. Workflow: Build and test powerful AI workflows on a visual canvas, leveraging all the following features and beyond.
https://github.com/langgenius/dify/assets/13230914/356df23e-1604-483d-80a6-9517ece318aa
2. Comprehensive model support: Seamless integration with hundreds of proprietary / open-source LLMs from dozens of inference providers and self-hosted solutions, covering GPT, Mistral, Llama3, and any OpenAI API-compatible models. A full list of supported model providers can be found here.
3. Prompt IDE: Intuitive interface for crafting prompts, comparing model performance, and adding additional features such as text-to-speech to a chat-based app.
4. RAG Pipeline: Extensive RAG capabilities that cover everything from document ingestion to retrieval, with out-of-box support for text extraction from PDFs, PPTs, and other common document formats.
5. Agent capabilities: You can define agents based on LLM Function Calling or ReAct, and add pre-built or custom tools for the agent. Dify provides 50+ built-in tools for AI agents, such as Google Search, DALL·E, Stable Diffusion and WolframAlpha.
6. LLMOps: Monitor and analyze application logs and performance over time. You could continuously improve prompts, datasets, and models based on production data and annotations.
7. Backend-as-a-Service: All of Dify's offerings come with corresponding APIs, so you could effortlessly integrate Dify into your own business logic.
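Because every app is exposed over an API, wiring Dify into existing code usually amounts to a single HTTP call. The sketch below is a hedged example assuming a self-hosted instance at http://localhost/v1 and a placeholder chat-app API key; it follows the shape of Dify's chat-messages endpoint, so adjust it to your app type and the API reference.

```python
# Minimal sketch of calling a Dify chat app over its API. The base URL and key
# below are assumptions (placeholders) - use your own deployment's values.
import requests

API_BASE = "http://localhost/v1"          # assumption: default self-hosted address
API_KEY = "app-xxxxxxxxxxxxxxxx"          # assumption: placeholder app API key

resp = requests.post(
    f"{API_BASE}/chat-messages",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "inputs": {},
        "query": "Summarize our refund policy in one sentence.",
        "response_mode": "blocking",
        "user": "demo-user",
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["answer"])
```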
Feature Comparison
Feature | Dify.AI | LangChain | Flowise | OpenAI Assistants API |
---|---|---|---|---|
Programming Approach | API + App-oriented | Python Code | App-oriented | API-oriented |
Supported LLMs | Rich Variety | Rich Variety | Rich Variety | OpenAI-only |
RAG Engine | ✅ | ✅ | ✅ | ✅ |
Agent | ✅ | ✅ | ❌ | ✅ |
Workflow | ✅ | ❌ | ✅ | ❌ |
Observability | ✅ | ✅ | ❌ | ❌ |
Enterprise Feature (SSO/Access control) | ✅ | ❌ | ❌ | ❌ |
Local Deployment | ✅ | ✅ | ✅ | ❌ |
Using Dify
- Cloud: We host a Dify Cloud service for anyone to try with zero setup. It provides all the capabilities of the self-deployed version, and includes 200 free GPT-4 calls in the sandbox plan.
- Self-hosting Dify Community Edition: Quickly get Dify running in your environment with this starter guide. Use our documentation for further references and more in-depth instructions.
- Dify for enterprise / organizations: We provide additional enterprise-centric features. Log your questions for us through this chatbot or send us an email to discuss enterprise needs. For startups and small businesses using AWS, check out Dify Premium on AWS Marketplace and deploy it to your own AWS VPC with one click. It's an affordable AMI offering with the option to create apps with custom logo and branding.
Staying ahead
Star Dify on GitHub and be instantly notified of new releases.
Advanced Setup
If you need to customize the configuration, please refer to the comments in our .env.example file and update the corresponding values in your .env file. Additionally, you might need to make adjustments to the docker-compose.yaml file itself, such as changing image versions, port mappings, or volume mounts, based on your specific deployment environment and requirements. After making any changes, please re-run docker-compose up -d. You can find the full list of available environment variables here.
If you'd like to configure a highly-available setup, there are community-contributed Helm Charts and YAML files which allow Dify to be deployed on Kubernetes.
Using Terraform for Deployment
Deploy Dify to a cloud platform with a single click using Terraform
Azure Global
Google Cloud
Using AWS CDK for Deployment
Deploy Dify to AWS with CDK
AWS
Contributing
For those who'd like to contribute code, see our Contribution Guide. At the same time, please consider supporting Dify by sharing it on social media and at events and conferences.
We are looking for contributors to help with translating Dify to languages other than Mandarin or English. If you are interested in helping, please see the i18n README for more information, and leave us a comment in the global-users channel of our Discord Community Server.
Community & contact
- GitHub Discussions. Best for: sharing feedback and asking questions.
- GitHub Issues. Best for: bugs you encounter using Dify.AI, and feature proposals. See our Contribution Guide.
- Discord. Best for: sharing your applications and hanging out with the community.
- X (Twitter). Best for: sharing your applications and hanging out with the community.
Contributors
Star history
Security disclosure
To protect your privacy, please avoid posting security issues on GitHub. Instead, send your questions to [email protected] and we will provide you with a more detailed answer.
License
This repository is available under the Dify Open Source License, which is essentially Apache 2.0 with a few additional restrictions.
Generate high-definition short videos with one click using AI large language models.
MoneyPrinterTurbo 💸
简体中文 | English
Simply provide a video topic or keyword, and the tool will automatically generate the video script, video footage, subtitles, and background music, then compose them into a high-definition short video.
Web interface
API interface
Special Thanks 🙏
Since deploying and using this project still has a certain barrier for beginner users, special thanks to 录咖 (an AI multimedia service platform) for providing a free AI video generator service based on this project, which can be used online without any deployment and is very convenient.
Sponsors 🙏
Thanks to PicWish (佐糖, https://picwish.cn) for supporting and sponsoring this project, which keeps it continuously updated and maintained.
PicWish focuses on image processing, offering a rich set of image-processing tools that dramatically simplify complex operations and truly make image processing easier.
Features 🎯
- Complete MVC architecture with a clean, maintainable code structure, supporting both an API and a web interface
- AI-generated video scripts, with support for custom copy as well
- Multiple high-definition video sizes: portrait 9:16 (1080x1920) and landscape 16:9 (1920x1080)
- Batch video generation: create several videos in one go, then pick the one you like best
- Configurable clip duration, making it easy to adjust how frequently the footage switches
- Video scripts in both Chinese and English
- Multiple synthesized voices, with real-time preview
- Subtitle generation with adjustable font, position, color, and size, plus support for subtitle outlines
- Background music, chosen at random or from a specified file, with adjustable volume
- High-definition, copyright-free video footage, with support for your own local material
- Integration with many LLM providers: OpenAI, Moonshot, Azure, gpt4free, one-api, Qwen (通义千问), Google Gemini, Ollama, DeepSeek, ERNIE Bot (文心一言), and more
- Users in China are advised to use DeepSeek or Moonshot as the LLM provider (directly accessible within China, no VPN needed; free sign-up credits are usually enough)
Roadmap 📅
- GPT-SoVITS voiceover support
- Improve speech synthesis with large models so the voices sound more natural and emotionally rich
- Add video transition effects for smoother playback
- Add more video footage sources and improve how well the footage matches the script
- Add video length options: short, medium, long
- Support more speech-synthesis providers, such as OpenAI TTS
- Automatic upload to YouTube
Video Demos 📺
Portrait 9:16 demos (including one with a more realistic synthesized voice)
Landscape 16:9 demos
System Requirements 📦
- Recommended minimum: 4 CPU cores or more and 8 GB of RAM or more; a GPU is not required
- Windows 10 or macOS 11.0 and above
Quick Start 🚀
Download the one-click launcher package, extract it, and use it directly (the path must not contain Chinese characters, special characters, or spaces)
Windows
- Baidu Netdisk (latest version 1.2.1): https://pan.baidu.com/s/1pSNjxTYiVENulTLm6zieMQ?pwd=g36q  extraction code: g36q
After downloading, it is recommended to double-click update.bat first to update to the latest code, then double-click start.bat to launch
After launching, a browser window opens automatically (if the page is blank, try opening it in Chrome or Edge)
Other systems
A one-click launcher package is not available yet; see the Installation & Deployment section below. Docker deployment is recommended, as it is more convenient.
Installation & Deployment 📥
Prerequisites
- Avoid paths containing Chinese characters to prevent unpredictable issues
- Make sure your network connection is working; if you use a VPN, enable global traffic mode
① Clone the code
git clone https://github.com/harry0703/MoneyPrinterTurbo.git
② Edit the configuration file
- Copy the config.example.toml file and name the copy config.toml
- Following the instructions in config.toml, set pexels_api_keys and llm_provider, and configure the API key for the provider selected by llm_provider
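Before launching, it can help to confirm the two settings above actually made it into config.toml. The check below is a small sketch (not part of MoneyPrinterTurbo) and assumes the keys live in the [app] section, as in config.example.toml:

```python
# Quick sanity check for config.toml (a sketch; assumes the [app] section
# layout from config.example.toml).
import tomllib  # Python 3.11+, matching the version used in the manual setup

with open("config.toml", "rb") as f:
    config = tomllib.load(f)

app = config.get("app", {})
assert app.get("pexels_api_keys"), "pexels_api_keys is not set"
assert app.get("llm_provider"), "llm_provider is not set"
print("llm_provider:", app["llm_provider"])
```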
Docker Deployment 🐳
① Start Docker
If Docker is not installed yet, install it first: https://www.docker.com/products/docker-desktop/
On Windows, refer to Microsoft's documentation:
- https://learn.microsoft.com/zh-cn/windows/wsl/install
- https://learn.microsoft.com/zh-cn/windows/wsl/tutorials/wsl-containers
cd MoneyPrinterTurbo
docker-compose up
Note: recent Docker releases install Docker Compose as a plugin, so the start command becomes docker compose up
② Access the web interface
Open a browser and visit http://0.0.0.0:8501
③ Access the API documentation
Open a browser and visit http://0.0.0.0:8080/docs or http://0.0.0.0:8080/redoc
Manual Deployment 📦
Video tutorials
- Full usage demo: https://v.douyin.com/iFhnwsKY/
- How to deploy on Windows: https://v.douyin.com/iFyjoW3M
① Create a virtual environment
It is recommended to use conda to create the Python virtual environment
git clone https://github.com/harry0703/MoneyPrinterTurbo.git
cd MoneyPrinterTurbo
conda create -n MoneyPrinterTurbo python=3.11
conda activate MoneyPrinterTurbo
pip install -r requirements.txt
② Install ImageMagick
- Windows:
  - Download from https://imagemagick.org/script/download.php, choose the Windows version, and be sure to pick the static library build, e.g. ImageMagick-7.1.1-32-Q16-x64-static.exe
  - Install the downloaded ImageMagick without changing the installation path
  - Set imagemagick_path in the config.toml configuration file to your actual installation path
- MacOS: brew install imagemagick
- Ubuntu: sudo apt-get install imagemagick
- CentOS: sudo yum install ImageMagick
③ Start the Web Interface 🌐
Note: the following commands must be run from the root directory of the MoneyPrinterTurbo project
Windows
conda activate MoneyPrinterTurbo
webui.bat
MacOS or Linux
conda activate MoneyPrinterTurbo
sh webui.sh
After launching, a browser window opens automatically (if the page is blank, try opening it in Chrome or Edge)
④ Start the API Service 🚀
python main.py
After starting, you can view the API documentation at http://127.0.0.1:8080/docs or http://127.0.0.1:8080/redoc and try the endpoints directly online.
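Once the API service is up, the interactive docs above are the authoritative reference; the snippet below is only a hedged sketch of what a call from Python might look like. The route /api/v1/videos and the video_subject field are assumptions for illustration, so check /docs for the real schema.

```python
# Hypothetical example call against the local API service; the path and body
# fields are assumptions - consult http://127.0.0.1:8080/docs for the actual API.
import requests

resp = requests.post(
    "http://127.0.0.1:8080/api/v1/videos",
    json={"video_subject": "Why the sky is blue"},
    timeout=30,
)
print(resp.status_code, resp.json())
```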
Speech Synthesis 🗣
For the full list of supported voices, see: Voice List
2024-04-16 v1.1.2 added 9 Azure speech-synthesis voices; they require an API KEY to be configured and sound more realistic.
Subtitle Generation 📜
Two subtitle generation methods are currently supported:
- edge: fast generation with better performance and no special hardware requirements, but the quality may be unstable
- whisper: slow generation with worse performance and some hardware requirements, but the quality is more reliable
Switch between them by changing subtitle_provider in the config.toml configuration file.
The edge mode is recommended; if the generated subtitle quality is poor, switch to whisper.
Note:
- whisper mode requires downloading a model file (roughly 3 GB) from HuggingFace, so make sure your network connection is good
- If the option is left empty, no subtitles are generated.
Since HuggingFace cannot be reached from within China, you can download the whisper-large-v3 model files using the links below.
Download links:
- Baidu Netdisk: https://pan.baidu.com/s/11h3Q6tsDtjQKTjUu3sc5cA?pwd=xjs9
- Quark Netdisk: https://pan.quark.cn/s/3ee3d991d64b
After downloading, extract the model and place the whole directory under .\MoneyPrinterTurbo\models; the final file path should look like this: .\MoneyPrinterTurbo\models\whisper-large-v3
MoneyPrinterTurbo
├─models
│ └─whisper-large-v3
│ config.json
│ model.bin
│ preprocessor_config.json
│ tokenizer.json
│ vocabulary.json
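After placing the files, a quick check that the layout matches the listing above can rule out path mistakes. This is a small helper sketch, not part of the project:

```python
# Verify the manually downloaded whisper-large-v3 files match the expected layout.
from pathlib import Path

model_dir = Path("./models/whisper-large-v3")
expected = ["config.json", "model.bin", "preprocessor_config.json",
            "tokenizer.json", "vocabulary.json"]
missing = [name for name in expected if not (model_dir / name).exists()]
print("missing files:", missing or "none")
```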
Background Music 🎵
Background music for the videos lives in the project's resource/songs directory.
The project ships with some default tracks taken from YouTube videos; if this infringes your rights, please remove them.
Subtitle Fonts 🅰
Fonts used to render video subtitles live in the project's resource/fonts directory; you can also add your own fonts there.
FAQ 🤔
❓How can I use the free OpenAI GPT-3.5 model?
OpenAI has announced that GPT-3.5 is free inside ChatGPT, and a developer has wrapped it as an API that can be called directly.
Make sure Docker is installed and running, then start the service with the following command:
docker run -p 3040:3040 missuo/freegpt35
Once it is running, update the settings in config.toml:
- set llm_provider to openai
- openai_api_key can be any placeholder value, e.g. '123456'
- set openai_base_url to http://localhost:3040/v1/
- set openai_model_name to gpt-3.5-turbo
Note: this method is not very stable
❓AttributeError: 'str' object has no attribute 'choices'
This error occurs when the LLM does not return a valid response.
It is most likely a network issue; using a VPN, or setting openai_base_url to your proxy, should resolve it.
It is also recommended to use Moonshot or DeepSeek as the LLM provider, since both are faster and more stable to access from within China.
❓RuntimeError: No ffmpeg exe could be found
Normally, ffmpeg is downloaded and detected automatically, but if your environment prevents the automatic download you may see the following error:
RuntimeError: No ffmpeg exe could be found.
Install ffmpeg on your system, or set the IMAGEIO_FFMPEG_EXE environment variable.
In that case, download ffmpeg from https://www.gyan.dev/ffmpeg/builds/, extract it, and set ffmpeg_path to your actual installation path.
[app]
ffmpeg_path = "C:\\Users\\harry\\Downloads\\ffmpeg.exe"  # set this to your actual path; note that the Windows path separator is \\
❓ImageMagick's security policy blocks an operation involving a temporary file such as @/tmp/tmpur5hyyto.txt
These policies live in ImageMagick's policy.xml configuration file, usually found at /etc/ImageMagick-X/ or a similar location in the ImageMagick installation directory. Edit the entry containing pattern="@" and change rights="none" to rights="read|write" to allow reading and writing of files.
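If you prefer to locate the offending entries programmatically before editing, the sketch below (not part of the project) lists the policy entries with pattern="@"; the file path is an assumption and depends on your ImageMagick version:

```python
# List ImageMagick policy entries with pattern="@" so you know which ones to
# change from rights="none" to rights="read|write". The path is an assumption.
import xml.etree.ElementTree as ET

tree = ET.parse("/etc/ImageMagick-6/policy.xml")  # adjust to your version/location
for policy in tree.getroot().iter("policy"):
    if policy.get("pattern") == "@":
        print(policy.attrib)
```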
❓OSError: [Errno 24] Too many open files
This error is caused by the system's limit on open files and can be fixed by raising that limit.
Check the current limit:
ulimit -n
If it is too low, raise it, for example:
ulimit -n 10240
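The same limit can also be read from inside Python with the standard library, which is handy for confirming the new value took effect in the shell that launches the app (a sketch, not part of the project):

```python
# Inspect the current open-file limit from Python using the standard library.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")  # raise with `ulimit -n 10240` if too low
```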
❓The Whisper model fails to download, with an error like:
LocalEntryNotfoundEror: Cannot find an appropriate cached snapshotfolderfor the specified revision on the local disk and outgoing trafic has been disabled. To enablerepo look-ups and downloads online, pass 'local files only=False' as input.
or
An error occured while synchronizing the model Systran/faster-whisper-large-v3 from the Hugging Face Hub: An error happened while trying to locate the files on the Hub and we cannot find the appropriate snapshot folder for the specified revision on the local disk. Please check your internet connection and try again. Trying to load the model directly from the local cache, if it exists.
Solution: click to see how to manually download the model from a netdisk (described above).
Feedback & Suggestions 📢
- Feel free to open an issue or submit a pull request.
Reference Project 📚
This project is a refactor of https://github.com/FujiwaraChoki/MoneyPrinter, with extensive optimizations and additional features. Thanks to the original author for their open-source spirit.
License 📝
Click to view the LICENSE file
Star History
Automate the process of making money online.
MoneyPrinter V2
♥︎ If you're interested in using a hosted version, sign up for the waitlist on shiori.ai, an AI tool that combines all other AI tools into one.
𝕏 Also, follow me on X: @DevBySami.
An Application that automates the process of making money online. MPV2 (MoneyPrinter Version 2) is, as the name suggests, the second version of the MoneyPrinter project. It is a complete rewrite of the original project, with a focus on a wider range of features and a more modular architecture.
Note: MPV2 needs Python 3.9 to function effectively. Watch the YouTube video here
Features
- Twitter Bot (with CRON jobs => scheduler)
- YouTube Shorts Automater (with CRON jobs => scheduler)
- Affiliate Marketing (Amazon + Twitter)
- Find local businesses & cold outreach
Versions
MoneyPrinter has different versions for multiple languages developed by the community for the community. Here are some known versions:
- Chinese: MoneyPrinterTurbo
If you would like to submit your own version/fork of MoneyPrinter, please open an issue describing the changes you made to the fork.
Installation
Please install Microsoft Visual C++ build tools first, so that CoquiTTS can function correctly.
⚠️ If you are planning to reach out to scraped businesses per E-Mail, please first install the Go Programming Language.
git clone https://github.com/FujiwaraChoki/MoneyPrinterV2.git
cd MoneyPrinterV2
# Copy Example Configuration and fill out values in config.json
cp config.example.json config.json
# Create a virtual environment
python -m venv venv
# Activate the virtual environment - Windows
.\venv\Scripts\activate
# Activate the virtual environment - Unix
source venv/bin/activate
# Install the requirements
pip install -r requirements.txt
Usage
# Run the application
python src/main.py
Documentation
All relevant documentation can be found here.
Scripts
For easier usage, there are some scripts in the scripts directory that can be used to directly access the core functionality of MPV2 without the need for user interaction.
All scripts need to be run from the root directory of the project, e.g. bash scripts/upload_video.sh.
Contributing
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us. Check out docs/Roadmap.md for a list of features that need to be implemented.
Code of Conduct
Please read CODE_OF_CONDUCT.md for details on our code of conduct, and the process for submitting pull requests to us.
License
MoneyPrinterV2 is licensed under the Affero General Public License v3.0. See LICENSE for more information.
Acknowledgments
Disclaimer
This project is for educational purposes only. The author will not be responsible for any misuse of the information provided. All the information on this website is published in good faith and for general information purpose only. The author does not make any warranties about the completeness, reliability, and accuracy of this information. Any action you take upon the information you find on this website (FujiwaraChoki/MoneyPrinterV2), is strictly at your own risk. The author will not be liable for any losses and/or damages in connection with the use of our website.
🪄 Create rich visualizations with AI
Data Formulator: Create Rich Visualizations with AI
Transform data and create rich visualizations iteratively with AI 🪄. Try Data Formulator now!

News 🔥🔥🔥
- [02-20-2025] Data Formulator 0.1.6 released!
  - Now supports working with multiple datasets at once! Tell Data Formulator which data tables you would like to use in the encoding shelf, and it will figure out how to join the tables to create a visualization to answer your question. 🪄
  - Check out the demo at https://github.com/microsoft/data-formulator/releases/tag/0.1.6.
  - Update your Data Formulator to the latest version to play with the new features.
- [02-12-2025] More models supported now!
  - Now supports OpenAI, Azure, Ollama, and Anthropic models (and more powered by LiteLLM);
  - Models with strong code generation and instruction following capabilities are recommended (gpt-4o, claude-3-5-sonnet etc.);
  - You can store API keys in api-keys.env to avoid typing them every time (see the template api-keys.env.template).
  - Let us know which models you have good/bad experiences with, and what models you would like to see supported! [comment here]
- [11-07-2024] Minor fun update: data visualization challenges!
  - We added a few visualization challenges with the sample datasets. Can you complete them all? [try them out!]
  - Comment in the issue when you did, or share your results/questions with others! [comment here]
- [10-11-2024] Data Formulator python package released!
  - You can now install Data Formulator using Python and run it locally, easily. [check it out]
  - Our Codespaces configuration is also updated for fast start up ⚡️. [try it now!]
  - New experimental feature: load an image or a messy text, and ask AI to parse and clean it for you(!). [demo]
- [10-01-2024] Initial release of Data Formulator, check out our [blog] and [video]!
Overview
Data Formulator is an application from Microsoft Research that uses large language models to transform data, expediting the practice of data visualization.
Data Formulator is an AI-powered tool for analysts to iteratively create rich visualizations. Unlike most chat-based AI tools where users need to describe everything in natural language, Data Formulator combines user interface interactions (UI) and natural language (NL) inputs for easier interaction. This blended approach makes it easier for users to describe their chart designs while delegating data transformation to AI.
Get Started
Play with Data Formulator with one of the following options:
- Option 1: Install via Python PIP
  Use Python PIP for an easy setup experience, running locally (recommended: install it in a virtual environment).
  pip install data_formulator  # install data_formulator
  data_formulator  # start data_formulator
  python -m data_formulator  # alternative way to run Data Formulator
  Data Formulator will be automatically opened in the browser at http://localhost:5000.
  Update: you can specify the port number (e.g., 8080) by python -m data_formulator --port 8080 if the default port is occupied.
- Option 2: Codespaces (5 minutes)
  You can also run Data Formulator in Codespaces; we have everything pre-configured. For more details, see CODESPACES.md.
- Option 3: Working in the developer mode
  You can build Data Formulator locally if you prefer full control over your development environment and the ability to customize the setup to your specific needs. For detailed instructions, refer to DEVELOPMENT.md.
Using Data Formulator
Once you've completed the setup using either option, follow these steps to start using Data Formulator:
The basics of data visualization
- Provide OpenAI keys and select a model (GPT-4o suggested) and choose a dataset.
- Choose a chart type, and then drag-and-drop data fields to chart properties (x, y, color, ...) to specify visual encodings.
https://github.com/user-attachments/assets/0fbea012-1d2d-46c3-a923-b1fc5eb5e5b8
Create visualization beyond the initial dataset (powered by 🤖)
- You can type names of fields that do not exist in current data in the encoding shelf:
- this tells Data Formulator that you want to create visualizations that require computation or transformation from existing data,
- you can optionally provide a natural language prompt to explain and clarify your intent (not necessary when field names are self-explanatory).
- Click the Formulate button.
- Data Formulator will transform data and instantiate the visualization based on the encoding and prompt.
- Inspect the data, chart and code.
- To create a new chart based on existing ones, follow up in natural language:
- provide a follow-up prompt (e.g., "show only top 5!"),
- you may also update visual encodings for the new chart.
https://github.com/user-attachments/assets/160c69d2-f42d-435c-9ff3-b1229b5bddba
https://github.com/user-attachments/assets/c93b3e84-8ca8-49ae-80ea-f91ceef34acb
Repeat this process as needed to explore and understand your data. Your explorations are trackable in the Data Threads panel.
Developers' Guide
Follow the developers' instructions to build your new data analysis tools on top of Data Formulator.
Research Papers
@article{wang2024dataformulator2iteratively,
  title={Data Formulator 2: Iteratively Creating Rich Visualizations with AI},
  author={Chenglong Wang and Bongshin Lee and Steven Drucker and Dan Marshall and Jianfeng Gao},
  year={2024},
  journal={arXiv preprint arXiv:2408.16119},
}
@article{wang2023data,
title={Data Formulator: AI-powered Concept-driven Visualization Authoring},
author={Wang, Chenglong and Thompson, John and Lee, Bongshin},
journal={IEEE Transactions on Visualization and Computer Graphics},
year={2023},
publisher={IEEE}
}
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repositories using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.
#1 Locally hosted web application that allows you to perform various operations on PDF files
Stirling-PDF
Stirling-PDF is a robust, locally hosted web-based PDF manipulation tool using Docker. It enables you to carry out various operations on PDF files, including splitting, merging, converting, reorganizing, adding images, rotating, compressing, and more. This locally hosted web application has evolved to encompass a comprehensive set of features, addressing all your PDF requirements.
All files and PDFs exist either exclusively on the client side, reside in server memory only during task execution, or temporarily reside in a file solely for the execution of the task. Any file downloaded by the user will have been deleted from the server by that point.
Homepage: https://stirlingpdf.com
All documentation available at https://docs.stirlingpdf.com/
Features
- 50+ PDF Operations
- Parallel file processing and downloads
- Dark mode support
- Custom download options
- Custom 'Pipelines' to run multiple features in an automated queue
- API for integration with external scripts
- Optional Login and Authentication support (see here for documentation)
- Database Backup and Import (see here for documentation)
- Enterprise features like SSO see here
PDF Features
Page Operations
- View and modify PDFs - View multi-page PDFs with custom viewing, sorting, and searching. Plus, on-page edit features like annotating, drawing, and adding text and images. (Using PDF.js with Joxit and Liberation fonts)
- Full interactive GUI for merging/splitting/rotating/moving PDFs and their pages
- Merge multiple PDFs into a single resultant file
- Split PDFs into multiple files at specified page numbers or extract all pages as individual files
- Reorganize PDF pages into different orders
- Rotate PDFs in 90-degree increments
- Remove pages
- Multi-page layout (format PDFs into a multi-paged page)
- Scale page contents size by set percentage
- Adjust contrast
- Crop PDF
- Auto-split PDF (with physically scanned page dividers)
- Extract page(s)
- Convert PDF to a single page
- Overlay PDFs on top of each other
- PDF to a single page
- Split PDF by sections
Conversion Operations
- Convert PDFs to and from images
- Convert any common file to PDF (using LibreOffice)
- Convert PDF to Word/PowerPoint/others (using LibreOffice)
- Convert HTML to PDF
- Convert PDF to XML
- Convert PDF to CSV
- URL to PDF
- Markdown to PDF
Security & Permissions
- Add and remove passwords
- Change/set PDF permissions
- Add watermark(s)
- Certify/sign PDFs
- Sanitize PDFs
- Auto-redact text
Other Operations
- Add/generate/write signatures
- Split by Size or PDF
- Repair PDFs
- Detect and remove blank pages
- Compare two PDFs and show differences in text
- Add images to PDFs
- Compress PDFs to decrease their filesize (using qpdf)
- Extract images from PDF
- Remove images from PDF
- Extract images from scans
- Remove annotations
- Add page numbers
- Auto-rename files by detecting PDF header text
- OCR on PDF (using Tesseract OCR)
- PDF/A conversion (using LibreOffice)
- Edit metadata
- Flatten PDFs
- Get all information on a PDF to view or export as JSON
- Show/detect embedded JavaScript
📖 Get Started
Visit our comprehensive documentation at docs.stirlingpdf.com for:
- Installation guides for all platforms
- Configuration options
- Feature documentation
- API reference
- Security setup
- Enterprise features
Supported Languages
Stirling-PDF currently supports 39 languages!
Language | Progress |
---|---|
Arabic (العربية) (ar_AR) | |
Azerbaijani (Azərbaycan Dili) (az_AZ) | |
Basque (Euskara) (eu_ES) | |
Bulgarian (Български) (bg_BG) | |
Catalan (Català) (ca_CA) | |
Croatian (Hrvatski) (hr_HR) | |
Czech (Česky) (cs_CZ) | |
Danish (Dansk) (da_DK) | |
Dutch (Nederlands) (nl_NL) | |
English (English) (en_GB) | |
English (US) (en_US) | |
French (Français) (fr_FR) | |
German (Deutsch) (de_DE) | |
Greek (Ελληνικά) (el_GR) | |
Hindi (हिंदी) (hi_IN) | |
Hungarian (Magyar) (hu_HU) | |
Indonesian (Bahasa Indonesia) (id_ID) | |
Irish (Gaeilge) (ga_IE) | |
Italian (Italiano) (it_IT) | |
Japanese (日本語) (ja_JP) | |
Korean (한국어) (ko_KR) | |
Norwegian (Norsk) (no_NB) | |
Persian (فارسی) (fa_IR) | |
Polish (Polski) (pl_PL) | |
Portuguese (Português) (pt_PT) | |
Portuguese Brazilian (Português) (pt_BR) | |
Romanian (Română) (ro_RO) | |
Russian (Русский) (ru_RU) | |
Serbian Latin alphabet (Srpski) (sr_LATN_RS) | |
Simplified Chinese (简体中文) (zh_CN) | |
Slovakian (Slovensky) (sk_SK) | |
Slovenian (Slovenščina) (sl_SI) | |
Spanish (Español) (es_ES) | |
Swedish (Svenska) (sv_SE) | |
Thai (ไทย) (th_TH) | |
Tibetan (བོད་ཡིག་) (zh_BO) | |
Traditional Chinese (繁體中文) (zh_TW) | |
Turkish (Türkçe) (tr_TR) | |
Ukrainian (Українська) (uk_UA) | |
Vietnamese (Tiếng Việt) (vi_VN) |
Stirling PDF Enterprise
Stirling PDF offers an Enterprise edition of its software. This is the same great software but with added features, support and comforts. Check out our Enterprise docs
🤝 Looking to contribute?
Join our community:
Visualize Ownership and Lifetimes in Rust
🦉
RustOwl
Visualize ownership and lifetimes in Rust for debugging and optimization
RustOwl visualizes ownership movement and lifetimes of variables. When you save Rust source code, it is analyzed, and the ownership and lifetimes of variables are visualized when you hover over a variable or function call.
RustOwl visualizes those by using underlines:
- 🟩 green: variable's actual lifetime
- 🟦 blue: immutable borrowing
- 🟪 purple: mutable borrowing
- 🟧 orange: value moved / function call
- 🟥 red: lifetime error - diff of lifetime between actual and expected
Currently, we offer a VSCode extension, a Neovim plugin, and an Emacs package. For these editors, move the text cursor over the variable or function call you want to inspect and wait for 2 seconds to visualize the information. We implemented an LSP server, cargo owlsp, with an extended protocol, so RustOwl can easily be used from other editors.
Quick Start
Here we describe how to start using RustOwl with VSCode.
Prerequisite
- curl, rustup and cargo installed
- Visual Studio Code (VSCode) installed
We tested this guide on macOS Sequoia 15.2 on arm64 architecture with VSCode 1.96.4 and rustup 1.27.1.
We also tested this guide on Ubuntu 25.04 on arm64 architecture with VSCode 1.96.4 and rustup 1.27.1. On Ubuntu, you need to run apt install build-essential before installing.
After installation, the extension will automatically run RustOwl when you save any Rust program in a cargo workspace. The initial analysis may take some time, but from the second run onward, compile caching is used to reduce the analysis time.
We tested on Windows 11 Education 23H2 on amd64 architecture. For Windows, please clone this repository and build RustOwl manually.
Install RustOwl
To install the RustOwl command, run the command below.
curl -L "https://github.com/cordx56/rustowl/releases/download/v0.1.4/install.sh" | sh
VSCode
You can install VSCode extension from this link.
Also, you can download the VSCode extension file (.vsix) from this link.
Other editor support
We support Neovim and Emacs. You can also create your own LSP client.
Neovim
Add to plugin manager:
{ "cordx56/rustowl", dependencies = { "neovim/nvim-lspconfig" } }
Configuration example:
lspconfig.rustowl.setup {
trigger = {
hover = false,
},
}
keymap(
"n",
"<c-o>",
require("rustowl").rustowl_cursor,
{ noremap = true, silent = true }
)
Emacs
Elpaca example:
(elpaca
(rustowlsp
:host github
:repo "cordx56/rustowl"
:files (:defaults "emacs/*")))
RustRover / IntelliJ IDEs
There is a third-party repository that supports IntelliJ IDEs.
Build manually
Here, we describe manual install instructions from source code.
RustOwl
Prerequisite
- rustup and cargo installed
  - You can install rustup from this link.
PATH
environment variable. To do this, follow the instructions provided by therustup
installer. For example, in bash, runexport PATH=$HOME/.cargo/bin:$PATH
.
- You can install
RustOwl has been tested on macOS Sequoia 15.2 on arm64 architecture with rustup 1.27.1. We have not tested the installation of dependencies from other package repositories, such as Homebrew. You may need to uninstall any Rust-related packages installed through those repositories first. Other dependencies are locked in the configuration files and will be installed automatically.
We have also tested this on Ubuntu 25.04 on arm64 architecture with rustup 1.27.1. Additional dependencies may be required. We have confirmed that running apt install build-essential is necessary on a freshly installed Ubuntu for linking.
Build & Run
cd rustowl
cargo install --path . --locked
cargo owlsp
VSCode extension
Prerequisite
- VSCode installed
- You can install VSCode from this link.
- Node.js installed
- yarn installed
  - After installing Node.js, you can install yarn by running npm install -g yarn.
The VSCode extension has been tested on macOS Sequoia 15.2 on arm64 architecture with Visual Studio Code 1.96.4, Node.js v20.16.0, and yarn 1.22.22. Other dependencies are locked in the configuration files and will be installed automatically.
Build & Run
First, install the dependencies.
cd vscode
yarn install --frozen-lockfile
Then open the vscode directory in VSCode.
A notification to install the recommended VSCode extension will appear in the bottom right corner of VSCode. Click the install button, wait for the installation to finish, and then restart VSCode.
Open the vscode directory again, and press the F5 key in the VSCode window. A new VSCode window with the extension enabled will appear.
Open a cargo workspace directory in the new VSCode window.
When you save Rust files, decoration indicating the movement of ownership and lifetimes will appear in the editor.
Note
In this tool, due to the limitations of VSCode's decoration specifications, characters with descenders, such as g or parentheses, may occasionally not display underlines properly. Additionally, we observed that the println! macro sometimes produces extra output, though this does not affect usability in any significant way.
No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents

Unstract
Intelligent Document Processing 2.0 (IDP 2.0) Platform Powered by Large Language Models
No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
🤖 Prompt Studio
Prompt Studio's primary reason for existence is so you can develop the necessary prompts for document data extraction super efficiently. It is a purpose-built environment that makes this not just easy for you, but a lot of fun! The document sample, its variants, the prompts you're developing, outputs from different LLMs, the schema you're developing, costing details of the extraction, and various tools that let you measure the effectiveness of your prompts are just a click away and easily accessible. Prompt Studio is designed for effective and high-speed development and iteration of prompts for document data extraction. Welcome to IDP 2.0!
🧘♀️ Three step nirvana with Workflow Studio
Automate critical business processes that involve complex documents with a human in the loop. Go beyond RPA with the power of Large Language Models.
🌟 Step 1: Add documents to no-code Prompt Studio and do prompt engineering to extract required fields
🌟 Step 2: Configure Prompt Studio project as API deployment or configure input source and output destination for ETL Pipeline
🌟 Step 3: Deploy Workflows as unstructured data APIs or unstructured data ETL Pipelines!
🚀 Getting started
System Requirements
- 8GB RAM (recommended)
Prerequisites
- Linux or MacOS (Intel or M-series)
- Docker
- Docker Compose (if you need to install it separately)
- Git
Next, either download a release or clone this repo and do the following:
✅ ./run-platform.sh
✅ Now visit http://frontend.unstract.localhost in your browser
✅ Use the username and password unstract to log in
That's all there is to it!
Follow these steps to change the default username and password.
See user guide for more details on managing the platform.
Another really quick way to experience Unstract is by signing up for our hosted version. It comes with a 14 day free trial!
⏩ Quick Start Guide
Unstract comes well documented. You can get introduced to the basics of Unstract, and learn how to connect various systems like LLMs, Vector Databases, Embedding Models and Text Extractors to it. The easiest way to get your feet wet is to go through our Quick Start Guide, where you actually get to do some prompt engineering in Prompt Studio and launch an API to structure varied credit card statements!
🤝 Ecosystem support
LLM Providers
Provider | Status
---|---
OpenAI | ✅ Working
Google VertexAI, Gemini Pro | ✅ Working
Azure OpenAI | ✅ Working
Anthropic | ✅ Working
Ollama | ✅ Working
Bedrock | ✅ Working
Google PaLM | ✅ Working
Anyscale | ✅ Working
Mistral AI | ✅ Working
Replicate | 🗓️ Coming soon!
Vector Databases
Provider | Status
---|---
Qdrant | ✅ Working
Weaviate | ✅ Working
Pinecone | ✅ Working
PostgreSQL | ✅ Working
Milvus | ✅ Working
Embeddings
Provider | Status
---|---
OpenAI | ✅ Working
Azure OpenAI | ✅ Working
Google PaLM | ✅ Working
Ollama | ✅ Working
Text Extractors
Provider | Status
---|---
Unstract LLMWhisperer | ✅ Working
Unstructured.io Community | ✅ Working
Unstructured.io Enterprise | ✅ Working
LlamaIndex Parse | ✅ Working
ETL Sources
Provider | Status
---|---
AWS S3 | ✅ Working
MinIO | ✅ Working
Google Cloud Storage | ✅ Working
Azure Cloud Storage | ✅ Working
Google Drive | ✅ Working
Dropbox | ✅ Working
SFTP | ✅ Working
Box | 🗓️ Coming soon!
HTTP/HTTPS | 🗓️ Coming soon!
ETL Destinations
Provider | Status
---|---
Snowflake | ✅ Working
Amazon Redshift | ✅ Working
Google BigQuery | ✅ Working
PostgreSQL | ✅ Working
MySQL | ✅ Working
MariaDB | ✅ Working
Microsoft SQL Server | ✅ Working
🙌 Contributing
Contributions are welcome! Please see CONTRIBUTING.md for further details to get started easily.
👋 Join the LLM-powered automation community
- On Slack, join great conversations around LLMs, their ecosystem and leveraging them to automate the previously unautomatable!
- Follow us on X/Twitter
- Follow us on LinkedIn
🚨 Backup encryption key
Do copy the value of the ENCRYPTION_KEY config in either the backend/.env or platform-service/.env file to a secure location.
Adapter credentials are encrypted by the platform using this key. Its loss or change will make all existing adapters inaccessible!
📊 A note on analytics
In full disclosure, Unstract integrates Posthog to track usage analytics. As you can inspect the relevant code here, we collect the minimum possible metrics. Posthog can be disabled if desired by setting REACT_APP_ENABLE_POSTHOG to false in the frontend's .env file.
Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet. Powered by Vercel AI SDK! Search with models like Grok 2.0.
Scira
A minimalistic AI-powered search engine that helps you find information on the internet.
Powered By
- Vercel AI SDK - For AI model integration and streaming
- Tavily AI - For search grounding and web search capabilities
Special Thanks
Features
- AI-powered search: Get answers to your questions using Anthropic's Models.
- Web search: Search the web using Tavily's API.
- URL Specific search: Get information from a specific URL.
- Weather: Get the current weather for any location using OpenWeather's API.
- Programming: Run code snippets in multiple languages using E2B's API.
- Maps: Get the location of any place using Google Maps API, Mapbox API, and TripAdvisor API.
- YouTube Search: Search for videos on YouTube and get timestamps and transcripts powered by Exa.AI - the Web Search API.
- Academic Search: Search for academic papers powered by Exa.AI - the Web Search API.
- X Posts Search: Search for posts on X.com powered by Exa.AI - the Web Search API.
- Flight Tracker: Track flights using AviationStack's API.
- Trending Movies and TV Shows: Get information about trending movies and TV shows.
- Movie or TV Show Search: Get information about any movie or TV show.
LLM used
- xAI's Grok
- Anthropic's Claude 3.5 Sonnet
- Meta's Llama 3.3 70B by Cerebras
- Deepseek R1 Distill by Groq Inc
Built with
- Next.js
- Tailwind CSS
- Vercel AI SDK
- Shadcn/UI
- Exa.AI
- Tavily
- OpenWeather
- E2B
- Google Maps
- Mapbox
- TripAdvisor
- AviationStack
Deploy your own
Set Scira as your default search engine
- Open the Chrome browser settings:
  - Click on the three vertical dots in the upper right corner of the browser.
  - Select "Settings" from the dropdown menu.
- Go to the search engine settings:
  - In the left sidebar, click on "Search engine."
  - Then select "Manage search engines and site search."
- Add a new search engine:
  - Click on "Add" next to "Site search."
- Set the search engine name:
  - Enter Scira in the "Search engine" field.
- Set the search engine URL:
  - Enter https://scira.app?q=%s in the "URL with %s in place of query" field.
- Set the search engine shortcut:
  - Enter sh in the "Shortcut" field.
- Set Default:
  - Click on the three dots next to the search engine you just added.
  - Select "Make default" from the dropdown menu.
After completing these steps, you should be able to use Scira as your default search engine in Chrome.
Local development
Run via Docker
The application can be run using Docker in two ways:
Using Docker Compose (Recommended)
- Make sure you have Docker and Docker Compose installed on your system
- Create a .env file based on .env.example with your API keys
- Run the following command in the project root:
docker compose up
- The application will be available at http://localhost:3000
Using Docker Directly
- Create a .env file based on .env.example with your API keys
- Build the Docker image:
docker build -t scira.app .
- Run the container:
docker run --env-file .env -p 3000:3000 scira.app
The application uses a multi-stage build process to minimize the final image size and implements security best practices. The production image runs on Node.js LTS with Alpine Linux for a minimal footprint.
Run with Node.js
To run the application locally without Docker:
- Sign up for accounts with the required AI providers:
- OpenAI (required)
- Anthropic (required)
- Tavily (required for web search feature)
- Copy .env.example to .env.local and fill in your API keys
- Install dependencies:
pnpm install
- Start the development server:
pnpm dev
- Open http://localhost:3000 in your browser
License
This project is licensed under the MIT License - see the LICENSE file for details.
Python tool for converting files and office documents to Markdown.
MarkItDown
[!IMPORTANT] MarkItDown 0.0.2 alpha 1 (0.0.2a1) introduces a plugin-based architecture. As much as was possible, command-line and Python interfaces have remained the same as 0.0.1a3 to support backward compatibility. Please report any issues you encounter. Some interface changes may yet occur as we continue to refine MarkItDown to a first non-alpha release.
MarkItDown is a utility for converting various files to Markdown (e.g., for indexing, text analysis, etc). It supports:
- PowerPoint
- Word
- Excel
- Images (EXIF metadata and OCR)
- Audio (EXIF metadata and speech transcription)
- HTML
- Text-based formats (CSV, JSON, XML)
- ZIP files (iterates over contents)
- ... and more!
To install MarkItDown, use pip: pip install markitdown. Alternatively, you can install it from the source:
git clone [email protected]:microsoft/markitdown.git
cd markitdown
pip install -e packages/markitdown
Usage
Command-Line
markitdown path-to-file.pdf > document.md
Or use -o to specify the output file:
markitdown path-to-file.pdf -o document.md
You can also pipe content:
cat path-to-file.pdf | markitdown
Plugins
MarkItDown also supports 3rd-party plugins. Plugins are disabled by default. To list installed plugins:
markitdown --list-plugins
To enable plugins use:
markitdown --use-plugins path-to-file.pdf
To find available plugins, search GitHub for the hashtag #markitdown-plugin. To develop a plugin, see packages/markitdown-sample-plugin.
Azure Document Intelligence
To use Microsoft Document Intelligence for conversion:
markitdown path-to-file.pdf -o document.md -d -e "<document_intelligence_endpoint>"
More information about how to set up an Azure Document Intelligence Resource can be found here
Python API
Basic usage in Python:
from markitdown import MarkItDown
md = MarkItDown(enable_plugins=False) # Set to True to enable plugins
result = md.convert("test.xlsx")
print(result.text_content)
Document Intelligence conversion in Python:
from markitdown import MarkItDown
md = MarkItDown(docintel_endpoint="<document_intelligence_endpoint>")
result = md.convert("test.pdf")
print(result.text_content)
To use Large Language Models for image descriptions, provide llm_client
and llm_model
:
from markitdown import MarkItDown
from openai import OpenAI
client = OpenAI()
md = MarkItDown(llm_client=client, llm_model="gpt-4o")
result = md.convert("example.jpg")
print(result.text_content)
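Building on the convert() calls above, a common pattern is converting a whole folder in one go. The loop below is a small sketch that uses only the documented API; the docs/ and converted/ directory names are placeholders:

```python
# Convert every file in a folder to Markdown using the documented convert() API.
from pathlib import Path

from markitdown import MarkItDown

md = MarkItDown(enable_plugins=False)
out_dir = Path("converted")          # placeholder output directory
out_dir.mkdir(exist_ok=True)

for path in Path("docs").iterdir():  # placeholder input directory
    if path.is_file():
        result = md.convert(str(path))
        (out_dir / f"{path.stem}.md").write_text(result.text_content, encoding="utf-8")
```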
Docker
docker build -t markitdown:latest .
docker run --rm -i markitdown:latest < ~/your-file.pdf > output.md
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
How to Contribute
You can help by looking at issues or helping review PRs. Any issue or PR is welcome, but we have also marked some as 'open for contribution' and 'open for reviewing' to help facilitate community contributions. These are of course just suggestions and you are welcome to contribute in any way you like.
All | Especially Needs Help from Community | |
---|---|---|
Issues | All Issues | Issues open for contribution |
PRs | All PRs | PRs open for reviewing |
Running Tests and Checks
-
Navigate to the MarkItDown package:
cd packages/markitdown
-
Install hatch in your environment and run tests:
pip install hatch  # Other ways of installing hatch: https://hatch.pypa.io/dev/install/
hatch shell
hatch test
(Alternative) Use the Devcontainer, which has all the dependencies installed: reopen the project in the Devcontainer and run hatch test.
-
Run pre-commit checks before submitting a PR:
pre-commit run --all-files
Contributing 3rd-party Plugins
You can also contribute by creating and sharing 3rd party plugins. See packages/markitdown-sample-plugin for more details.
Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
English | 简体中文 | 繁体中文 | 日本語 | 한국어 | Bahasa Indonesia | Português (Brasil)
Document | Roadmap | Twitter | Discord | Demo
📕 Table of Contents
- 💡 What is RAGFlow?
- 🎮 Demo
- 📌 Latest Updates
- 🌟 Key Features
- 🔎 System Architecture
- 🎬 Get Started
- 🔧 Configurations
- 🔧 Build a docker image without embedding models
- 🔧 Build a docker image including embedding models
- 🔨 Launch service from source for development
- 📚 Documentation
- 📜 Roadmap
- 🏄 Community
- 🙌 Contributing
💡 What is RAGFlow?
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It offers a streamlined RAG workflow for businesses of any scale, combining LLM (Large Language Models) to provide truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.
🎮 Demo
Try our demo at https://demo.ragflow.io.
🔥 Latest Updates
- 2025-02-05 Updates the model list of 'SILICONFLOW' and adds support for Deepseek-R1/DeepSeek-V3.
- 2025-01-26 Optimizes knowledge graph extraction and application, offering various configuration options.
- 2024-12-18 Upgrades Document Layout Analysis model in Deepdoc.
- 2024-12-04 Adds support for pagerank score in knowledge base.
- 2024-11-22 Adds more variables to Agent.
- 2024-11-01 Adds keyword extraction and related question generation to the parsed chunks to improve the accuracy of retrieval.
- 2024-08-22 Supports text-to-SQL statements through RAG.
🎉 Stay Tuned
⭐️ Star our repository to stay up-to-date with exciting new features and improvements! Get instant notifications for new releases! 🌟
🌟 Key Features
🍭 "Quality in, quality out"
- Deep document understanding-based knowledge extraction from unstructured data with complicated formats.
- Finds "needle in a data haystack" of literally unlimited tokens.
🍱 Template-based chunking
- Intelligent and explainable.
- Plenty of template options to choose from.
🌱 Grounded citations with reduced hallucinations
- Visualization of text chunking to allow human intervention.
- Quick view of the key references and traceable citations to support grounded answers.
🍔 Compatibility with heterogeneous data sources
- Supports Word, slides, excel, txt, images, scanned copies, structured data, web pages, and more.
🛀 Automated and effortless RAG workflow
- Streamlined RAG orchestration catered to both personal and large businesses.
- Configurable LLMs as well as embedding models.
- Multiple recall paired with fused re-ranking.
- Intuitive APIs for seamless integration with business.
🔎 System Architecture
🎬 Get Started
📝 Prerequisites
- CPU >= 4 cores
- RAM >= 16 GB
- Disk >= 50 GB
- Docker >= 24.0.0 & Docker Compose >= v2.26.1
If you have not installed Docker on your local machine (Windows, Mac, or Linux), see Install Docker Engine.
🚀 Start up the server
- Ensure vm.max_map_count >= 262144:
  To check the value of vm.max_map_count:
  $ sysctl vm.max_map_count
  Reset vm.max_map_count to a value of at least 262144 if it is not:
  $ sudo sysctl -w vm.max_map_count=262144  # in this case, we set it to 262144
  This change will be reset after a system reboot. To ensure your change remains permanent, add or update the vm.max_map_count value in /etc/sysctl.conf accordingly:
  vm.max_map_count=262144
- Clone the repo:
$ git clone https://github.com/infiniflow/ragflow.git
- Start up the server using the pre-built Docker images:
  The command below downloads the v0.16.0-slim edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from v0.16.0-slim, update the RAGFLOW_IMAGE variable accordingly in docker/.env before using docker compose to start the server. For example: set RAGFLOW_IMAGE=infiniflow/ragflow:v0.16.0 for the full edition v0.16.0.
  $ cd ragflow/docker
  $ docker compose -f docker-compose.yml up -d
  RAGFlow image tag | Image size (GB) | Has embedding models? | Stable?
  ---|---|---|---
  v0.16.0 | ≈9 | ✔ | Stable release
  v0.16.0-slim | ≈2 | ❌ | Stable release
  nightly | ≈9 | ✔ | Unstable nightly build
  nightly-slim | ≈2 | ❌ | Unstable nightly build
- Check the server status after having the server up and running:
$ docker logs -f ragflow-server
The following output confirms a successful launch of the system:
____ ___ ______ ______ __ / __ \ / | / ____// ____// /____ _ __ / /_/ // /| | / / __ / /_ / // __ \| | /| / / / _, _// ___ |/ /_/ // __/ / // /_/ /| |/ |/ / /_/ |_|/_/ |_|\____//_/ /_/ \____/ |__/|__/ * Running on all addresses (0.0.0.0) * Running on http://127.0.0.1:9380 * Running on http://x.x.x.x:9380 INFO:werkzeug:Press CTRL+C to quit
If you skip this confirmation step and directly log in to RAGFlow, your browser may prompt a network anormal error because, at that moment, your RAGFlow may not be fully initialized.
- In your web browser, enter the IP address of your server and log in to RAGFlow.
  With the default settings, you only need to enter http://IP_OF_YOUR_MACHINE (sans port number), as the default HTTP serving port 80 can be omitted when using the default configurations.
- In service_conf.yaml.template, select the desired LLM factory in user_default_llm and update the API_KEY field with the corresponding API key. See llm_api_key_setup for more information.
The show is on!
🔧 Configurations
When it comes to system configurations, you will need to manage the following files:
- .env: Keeps the fundamental setups for the system, such as SVR_HTTP_PORT, MYSQL_PASSWORD, and MINIO_PASSWORD.
- service_conf.yaml.template: Configures the back-end services. The environment variables in this file will be automatically populated when the Docker container starts. Any environment variables set within the Docker container will be available for use, allowing you to customize service behavior based on the deployment environment.
- docker-compose.yml: The system relies on docker-compose.yml to start up.
The ./docker/README file provides a detailed description of the environment settings and service configurations which can be used as ${ENV_VARS} in the service_conf.yaml.template file.
To update the default HTTP serving port (80), go to docker-compose.yml and change 80:80 to <YOUR_SERVING_PORT>:80.
Updates to the above configurations require a reboot of all containers to take effect:
$ docker compose -f docker-compose.yml up -d
Switch doc engine from Elasticsearch to Infinity
RAGFlow uses Elasticsearch by default for storing full text and vectors. To switch to Infinity, follow these steps:
-
Stop all running containers:
$ docker compose -f docker/docker-compose.yml down -v
Note: -v will delete the docker container volumes, and the existing data will be cleared.
Set
DOC_ENGINE
in docker/.env toinfinity
. -
Start the containers:
$ docker compose -f docker-compose.yml up -d
[!WARNING] Switching to Infinity on a Linux/arm64 machine is not yet officially supported.
🔧 Build a Docker image without embedding models
This image is approximately 2 GB in size and relies on external LLM and embedding services.
git clone https://github.com/infiniflow/ragflow.git
cd ragflow/
docker build --build-arg LIGHTEN=1 -f Dockerfile -t infiniflow/ragflow:nightly-slim .
🔧 Build a Docker image including embedding models
This image is approximately 9 GB in size. As it includes embedding models, it relies on external LLM services only.
git clone https://github.com/infiniflow/ragflow.git
cd ragflow/
docker build -f Dockerfile -t infiniflow/ragflow:nightly .
🔨 Launch service from source for development
-
Install uv, or skip this step if it is already installed:
pipx install uv
-
Clone the source code and install Python dependencies:
git clone https://github.com/infiniflow/ragflow.git cd ragflow/ uv sync --python 3.10 --all-extras # install RAGFlow dependent python modules
-
Launch the dependent services (MinIO, Elasticsearch, Redis, and MySQL) using Docker Compose:
docker compose -f docker/docker-compose-base.yml up -d
Add the following line to /etc/hosts to resolve all hosts specified in docker/.env to 127.0.0.1:
127.0.0.1 es01 infinity mysql minio redis
-
If you cannot access HuggingFace, set the HF_ENDPOINT environment variable to use a mirror site:
export HF_ENDPOINT=https://hf-mirror.com
-
Launch backend service:
source .venv/bin/activate export PYTHONPATH=$(pwd) bash docker/launch_backend_service.sh
-
Install frontend dependencies:
cd web npm install
-
Launch frontend service:
npm run dev
The following output confirms a successful launch of the system:
📚 Documentation
📜 Roadmap
See the RAGFlow Roadmap 2025
🏄 Community
🙌 Contributing
RAGFlow flourishes via open-source collaboration. In this spirit, we embrace diverse contributions from the community. If you would like to be a part, review our Contribution Guidelines first.
Build your own AI friend
小智 AI 聊天机器人 (XiaoZhi AI Chatbot)
This is Xiage's (虾哥) first hardware project.
👉 Build your AI chat companion with ESP32 + SenseVoice + Qwen72B! [bilibili]
👉 Give XiaoZhi a clever DeepSeek brain [bilibili]
👉 Hand-build your AI girlfriend, a beginner's tutorial [bilibili]
Project Purpose
This is an open-source project released under the MIT license, free for anyone to use, including for commercial purposes.
We hope this project helps more people get started with AI hardware development and understand how to apply today's rapidly evolving large language models to real hardware devices. Whether you are a student interested in AI or a developer exploring new technologies, this project offers valuable learning experience.
Everyone is welcome to participate in developing and improving the project. If you have any ideas or suggestions, please open an Issue or join the chat group.
Study and discussion QQ group: 946599635
Implemented Features
- Wi-Fi / ML307 Cat.1 4G
- Wake-up and interruption via the BOOT button, supporting both click and long-press triggers
- Offline voice wake-up with ESP-SR
- Streaming voice conversation (WebSocket or UDP protocol)
- Recognition of 5 languages via SenseVoice: Mandarin, Cantonese, English, Japanese, and Korean
- Voiceprint recognition to identify who is calling the AI's name, via 3D Speaker
- LLM-based TTS (Volcengine or CosyVoice)
- Large language models (Qwen, DeepSeek, Doubao)
- Configurable prompts and voices (custom characters)
- Short-term memory, with self-summarization after each conversation round
- OLED / LCD display showing signal strength or conversation content
- Image emoji display on LCD
- Multi-language support (Chinese, English)
Hardware
Hands-on breadboard build
See the Feishu document tutorial for details:
The breadboard build looks like this:
Supported open-source hardware
- LiChuang ESP32-S3 development board (立创·实战派)
- Espressif ESP32-S3-BOX3
- M5Stack CoreS3
- AtomS3R + Echo Base
- AtomMatrix + Echo Base
- Magic Button 2.4 (神奇按钮 2.4)
- Waveshare ESP32-S3-Touch-AMOLED-1.8
- LILYGO T-Circle-S3
- Xiage Mini C3 (虾哥 Mini C3)
- Moji XiaoZhi AI derivative edition
- Wuming Tech Nologo XingZhi 1.54TFT
- Wuming Tech Nologo XingZhi 0.96TFT
Firmware
Flashing without a development environment
For first-time users, it is recommended not to set up a development environment and instead flash the prebuilt firmware directly.
The firmware connects to the official xiaozhi.me server by default; currently, individual users who register an account can use the Qwen real-time model for free.
Development Environment
- Cursor or VSCode
- Install the ESP-IDF extension and select SDK version 5.3 or above
- Linux works better than Windows: compilation is faster and there are no driver headaches
- The project uses the Google C++ code style; please make sure your code conforms before submitting
Agent Configuration
If you already own a XiaoZhi AI chatbot device, you can log in to the xiaozhi.me console to configure it.
Technical Principles & Self-Hosted Deployment
To deploy the server on your own computer, see xiaozhi-esp32-server, another author's project that is also open source under the MIT license.
Star History
A free + OSS logo generator powered by Flux on Together AI

AI Logo Generator
An open source logo generator – create professional logos in seconds with customizable styles.
Tech stack
- Flux Pro 1.1 on Together AI for logo generation
- Next.js with TypeScript for the app framework
- Shadcn for UI components & Tailwind for styling
- Upstash Redis for rate limiting
- Clerk for authentication
- Plausible & Helicone for analytics & observability
Cloning & running
- Clone the repo:

  git clone https://github.com/Nutlope/logocreator

- Create a .env file and add your Together AI API key:

  TOGETHER_API_KEY=

- Run npm install and npm run dev to install dependencies and run locally.
Future Tasks
- Create a dashboard with a user's logo history
- Support SVG exports instead of just PNG
- Add support for additional styles
- Add a dropdown for image size (can do up to 1440x1440)
- Show approximate price when using your own Together AI key
- Allow the ability to upload a reference logo (use vision model to read it)
- Redesign popular brands' logos with the logo maker and show them in a showcase
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
MetaGPT: The Multi-Agent Framework
Assign different roles to GPTs to form a collaborative entity for complex tasks.
News
🚀 Feb. 19, 2025: Today we are officially launching our natural language programming product: MGX (MetaGPT X) - the world's first AI agent development team. Official website Twitter
🚀 Jan. 22, 2025: Our paper AFlow: Automating Agentic Workflow Generation accepted for oral presentation (top 1.8%) at ICLR 2025, ranking #2 in the LLM-based Agent category.
🚀 Oct. 29, 2024: We introduced three papers: AFLOW, FACT, and SELA, check the code!
🚀 Mar. 29, 2024: v0.8.0 released. Now you can use Data Interpreter (arxiv, example, code) via pypi package import. Meanwhile, we integrated the RAG module and supported multiple new LLMs.
🚀 Feb. 08, 2024: v0.7.0 released, supporting assigning different LLMs to different Roles. We also introduced Data Interpreter, a powerful agent capable of solving a wide range of real-world problems.
🚀 Jan. 16, 2024: Our paper MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework accepted for oral presentation (top 1.2%) at ICLR 2024, ranking #1 in the LLM-based Agent category.
🚀 Jan. 03, 2024: v0.6.0 released, new features include serialization, upgraded OpenAI package and supported multiple LLM, provided minimal example for debate etc.
🚀 Dec. 15, 2023: v0.5.0 released, introducing some experimental features such as incremental development, multilingual, multiple programming languages, etc.
🔥 Nov. 08, 2023: MetaGPT is selected into Open100: Top 100 Open Source achievements.
🔥 Sep. 01, 2023: MetaGPT tops GitHub Trending Monthly for the 17th time in August 2023.
🌟 Jun. 30, 2023: MetaGPT is now open source.
🌟 Apr. 24, 2023: First line of MetaGPT code committed.
Software Company as Multi-Agent System
- MetaGPT takes a one line requirement as input and outputs user stories / competitive analysis / requirements / data structures / APIs / documents, etc.
- Internally, MetaGPT includes product managers / architects / project managers / engineers. It provides the entire process of a software company along with carefully orchestrated SOPs.
`Code = SOP(Team)` is the core philosophy. We materialize SOP and apply it to teams composed of LLMs.
Software Company Multi-Agent Schematic (Gradually Implementing)
Get Started
Installation
Ensure that Python 3.9 or later, but less than 3.12, is installed on your system. You can check this by using:

python --version

You can use conda like this:

conda create -n metagpt python=3.9 && conda activate metagpt
pip install --upgrade metagpt
# or `pip install --upgrade git+https://github.com/geekan/MetaGPT.git`
# or `git clone https://github.com/geekan/MetaGPT && cd MetaGPT && pip install --upgrade -e .`
For detailed installation guidance, please refer to cli_install or docker_install
Configuration
You can initialize the MetaGPT config by running the following command, or manually create the ~/.metagpt/config2.yaml file:
# Check https://docs.deepwisdom.ai/main/en/guide/get_started/configuration.html for more details
metagpt --init-config # it will create ~/.metagpt/config2.yaml, just modify it to your needs
You can configure ~/.metagpt/config2.yaml according to the example and the docs:
llm:
api_type: "openai" # or azure / ollama / groq etc. Check LLMType for more options
model: "gpt-4-turbo" # or gpt-3.5-turbo
base_url: "https://api.openai.com/v1" # or forward url / other llm url
api_key: "YOUR_API_KEY"
Usage
After installation, you can use MetaGPT from the CLI:
metagpt "Create a 2048 game" # this will create a repo in ./workspace
or use it as a library:
from metagpt.software_company import generate_repo, ProjectRepo
repo: ProjectRepo = generate_repo("Create a 2048 game") # or ProjectRepo("<path>")
print(repo) # it will print the repo structure with files
You can also use Data Interpreter to write code:
import asyncio
from metagpt.roles.di.data_interpreter import DataInterpreter

async def main():
    di = DataInterpreter()
    await di.run("Run data analysis on sklearn Iris dataset, include a plot")

asyncio.run(main())  # or await main() in a jupyter notebook setting
QuickStart & Demo Video
- Try it on MetaGPT Huggingface Space
- Matthew Berman: How To Install MetaGPT - Build A Startup With One Prompt!!
- Official Demo Video
https://github.com/geekan/MetaGPT/assets/34952977/34345016-5d13-489d-b9f9-b82ace413419
Tutorial
- 🗒 Online Document
- 💻 Usage
- 🔎 What can MetaGPT do?
- 🛠 How to build your own agents?
- 🧑💻 Contribution
- 🔖 Use Cases
- ❓ FAQs
Support
Discord Join US
📢 Join Our Discord Channel! Looking forward to seeing you there! 🎉
Contributor form
📝 Fill out the form to become a contributor. We are looking forward to your participation!
Contact Information
If you have any questions or feedback about this project, please feel free to contact us. We highly appreciate your suggestions!
- Email: [email protected]
- GitHub Issues: For more technical inquiries, you can also create a new issue in our GitHub repository.
We will respond to all questions within 2-3 business days.
Citation
To stay updated with the latest research and development, follow @MetaGPT_ on Twitter.
To cite MetaGPT or Data Interpreter in publications, please use the following BibTeX entries.
@inproceedings{hong2024metagpt,
title={Meta{GPT}: Meta Programming for A Multi-Agent Collaborative Framework},
author={Sirui Hong and Mingchen Zhuge and Jonathan Chen and Xiawu Zheng and Yuheng Cheng and Jinlin Wang and Ceyao Zhang and Zili Wang and Steven Ka Shing Yau and Zijuan Lin and Liyang Zhou and Chenyu Ran and Lingfeng Xiao and Chenglin Wu and J{\"u}rgen Schmidhuber},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=VtmBAGCN7o}
}
@misc{teng2025atom,
title={Atom of Thoughts for Markov LLM Test-Time Scaling},
author={Fengwei Teng and Zhaoyang Yu and Quan Shi and Jiayi Zhang and Chenglin Wu and Yuyu Luo},
year={2025},
eprint={2502.12018},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.12018},
}
@misc{xiang2025self,
title={Self-Supervised Prompt Optimization},
author={Jinyu Xiang and Jiayi Zhang and Zhaoyang Yu and Fengwei Teng and Jinhao Tu and Xinbing Liang and Sirui Hong and Chenglin Wu and Yuyu Luo},
year={2025},
eprint={2502.06855},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.06855},
}
@inproceedings{wang2025fact,
title={FACT: Examining the Effectiveness of Iterative Context Rewriting for Multi-fact Retrieval},
author={Jinlin Wang and Suyuchen Wang and Ziwen Xia and Sirui Hong and Yun Zhu and Bang Liu and Chenglin Wu},
booktitle={The 2025 Annual Conference of the Nations of the Americas Chapter of the ACL},
year={2025},
url={https://openreview.net/forum?id=VXOircx5h3}
}
@misc{chi2024sela,
title={SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning},
author={Yizhou Chi and Yizhang Lin and Sirui Hong and Duyi Pan and Yaying Fei and Guanghao Mei and Bangbang Liu and Tianqi Pang and Jacky Kwok and Ceyao Zhang and Bang Liu and Chenglin Wu},
year={2024},
eprint={2410.17238},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2410.17238},
}
@inproceedings{zhang2025aflow,
title={{AF}low: Automating Agentic Workflow Generation},
author={Jiayi Zhang and Jinyu Xiang and Zhaoyang Yu and Fengwei Teng and Xiong-Hui Chen and Jiaqi Chen and Mingchen Zhuge and Xin Cheng and Sirui Hong and Jinlin Wang and Bingnan Zheng and Bang Liu and Yuyu Luo and Chenglin Wu},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=z5uVAKwmjf}
}
@misc{hong2024data,
title={Data Interpreter: An LLM Agent For Data Science},
author={Sirui Hong and Yizhang Lin and Bang Liu and Bangbang Liu and Binhao Wu and Danyang Li and Jiaqi Chen and Jiayi Zhang and Jinlin Wang and Li Zhang and Lingyao Zhang and Min Yang and Mingchen Zhuge and Taicheng Guo and Tuo Zhou and Wei Tao and Wenyi Wang and Xiangru Tang and Xiangtao Lu and Xiawu Zheng and Xinbing Liang and Yaying Fei and Yuheng Cheng and Zongze Xu and Chenglin Wu},
year={2024},
eprint={2402.18679},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2402.18679},
}
Open source software that helps you create and deploy high-frequency crypto trading bots
Hummingbot is an open-source framework that helps you design and deploy automated trading strategies, or bots, that can run on many centralized or decentralized exchanges. Over the past year, Hummingbot users have generated over $34 billion in trading volume across 140+ unique trading venues.
The Hummingbot codebase is free and publicly available under the Apache 2.0 open-source license. Our mission is to democratize high-frequency trading by creating a global community of algorithmic traders and developers that share knowledge and contribute to the codebase.
Quick Links
- Website and Docs: Official Hummingbot website and documentation
- Installation: Install Hummingbot on various platforms
- Discord: The main gathering spot for the global Hummingbot community
- YouTube: Videos that teach you how to get the most out of Hummingbot
- Twitter: Get the latest announcements about Hummingbot
- Reported Volumes: Reported trading volumes across all Hummingbot instances
- Newsletter: Get our newsletter whenever we ship a new release
Exchange Connectors
Hummingbot connectors standardize REST and WebSocket API interfaces to different types of exchanges, enabling you to build sophisticated trading strategies that can be deployed across many exchanges with minimal changes. We classify exchanges into the following categories:
- CEX: Centralized exchanges that take custody of your funds. Use API keys to connect with Hummingbot.
- DEX: Decentralized, non-custodial exchanges that operate on a blockchain. Use wallet keys to connect with Hummingbot.
In addition, connectors differ based on the type of market supported:
- CLOB Spot: Connectors to spot markets on central limit order book (CLOB) exchanges
- CLOB Perp: Connectors to perpetual futures markets on CLOB exchanges
- AMM: Connectors to spot markets on Automated Market Maker (AMM) decentralized exchanges
Exchange Sponsors
We are grateful for the following exchanges that support the development and maintenance of Hummingbot via broker partnerships and sponsorships.
Connector ID | Exchange | CEX/DEX | Market Type | Docs | Discount |
---|---|---|---|---|---|
`binance` | Binance | CEX | CLOB Spot | Docs | |
`binance_perpetual` | Binance | CEX | CLOB Perp | Docs | |
`gate_io` | Gate.io | CEX | CLOB Spot | Docs | |
`gate_io_perpetual` | Gate.io | CEX | CLOB Perp | Docs | |
`htx` | HTX (Huobi) | CEX | CLOB Spot | Docs | |
`kucoin` | KuCoin | CEX | CLOB Spot | Docs | |
`kucoin_perpetual` | KuCoin | CEX | CLOB Perp | Docs | |
`okx` | OKX | CEX | CLOB Spot | Docs | |
`okx_perpetual` | OKX | CEX | CLOB Perp | Docs | |
`dydx_v4_perpetual` | dYdX | DEX | CLOB Perp | Docs | - |
`hyperliquid_perpetual` | Hyperliquid | DEX | CLOB Perp | Docs | - |
`xrpl` | XRP Ledger | DEX | CLOB Spot | Docs | - |
Other Exchange Connectors
Currently, the master branch of Hummingbot also includes the following exchange connectors, which are maintained and updated through the Hummingbot Foundation governance process. See Governance for more information.
Connector ID | Exchange | CEX/DEX | Type | Docs | Discount |
---|---|---|---|---|---|
`ascend_ex` | AscendEx | CEX | CLOB Spot | Docs | - |
`balancer` | Balancer | DEX | AMM | Docs | - |
`bitget_perpetual` | Bitget | CEX | CLOB Perp | Docs | - |
`bitmart` | BitMart | CEX | CLOB Spot | Docs | - |
`bitrue` | Bitrue | CEX | CLOB Spot | Docs | - |
`bitstamp` | Bitstamp | CEX | CLOB Spot | Docs | - |
`btc_markets` | BTC Markets | CEX | CLOB Spot | Docs | - |
`bybit` | Bybit | CEX | CLOB Spot | Docs | - |
`bybit_perpetual` | Bybit | CEX | CLOB Perp | Docs | - |
`carbon` | Carbon | DEX | AMM | Docs | - |
`coinbase_advanced_trade` | Coinbase | CEX | CLOB Spot | Docs | - |
`cube` | Cube | CEX | CLOB Spot | Docs | - |
`curve` | Curve | DEX | AMM | Docs | - |
`dexalot` | Dexalot | DEX | CLOB Spot | Docs | - |
`hashkey` | HashKey | CEX | CLOB Spot | Docs | - |
`hashkey_perpetual` | HashKey | CEX | CLOB Perp | Docs | - |
`injective_v2` | Injective Helix | DEX | CLOB Spot | Docs | - |
`injective_v2_perpetual` | Injective Helix | DEX | CLOB Perp | Docs | - |
`kraken` | Kraken | CEX | CLOB Spot | Docs | - |
`mad_meerkat` | Mad Meerkat | DEX | AMM | Docs | - |
`mexc` | MEXC | CEX | CLOB Spot | Docs | - |
`ndax` | NDAX | CEX | CLOB Spot | Docs | - |
`openocean` | OpenOcean | DEX | AMM | Docs | - |
`pancakeswap` | PancakeSwap | DEX | AMM | Docs | - |
`pangolin` | Pangolin | DEX | AMM | Docs | - |
`polkadex` | Polkadex | DEX | CLOB Spot | Docs | - |
`quickswap` | QuickSwap | DEX | AMM | Docs | - |
`sushiswap` | SushiSwap | DEX | AMM | Docs | - |
`tinyman` | Tinyman | DEX | AMM | Docs | - |
`traderjoe` | Trader Joe | DEX | AMM | Docs | - |
`uniswap` | Uniswap | DEX | AMM | Docs | - |
`vertex` | Vertex | DEX | CLOB Spot | Docs | - |
`vvs` | VVS | DEX | AMM | Docs | - |
`xsswap` | XSSwap | DEX | AMM | Docs | - |
Other Hummingbot Repos
- Deploy: Deploy Hummingbot in various configurations with Docker
- Dashboard: Web app that helps you create, backtest, deploy, and manage Hummingbot instances
- Quants Lab: Jupyter notebooks that enable you to fetch data and perform research using Hummingbot
- Gateway: TypeScript-based API client for DEX connectors
- Hummingbot Site: Official documentation for Hummingbot - we welcome contributions here too!
Contributions
The Hummingbot architecture features modular components that can be maintained and extended by individual community members.
We welcome contributions from the community! Please review these guidelines before submitting a pull request.
To have your exchange connector or other pull request merged into the codebase, please submit a New Connector Proposal or Pull Request Proposal, following these guidelines. Note that you will need some amount of HBOT tokens in your Ethereum wallet to submit a proposal.
Legal
- License: Hummingbot is open source and licensed under Apache 2.0.
- Data collection: See Reporting for information on anonymous data collection and reporting in Hummingbot.
Docmost is open-source collaborative wiki and documentation software, an open-source alternative to Confluence and Notion.
Docmost
Open-source collaborative wiki and documentation software.
Website | Documentation
[!NOTE]
Docmost is currently in beta. We value your feedback as we progress towards a stable release.
Getting started
To get started with Docmost, please refer to our documentation.
Features
- Real-time collaboration
- Diagrams (Draw.io, Excalidraw and Mermaid)
- Spaces
- Permissions management
- Groups
- Comments
- Page history
- Search
- File attachment
Screenshots
Contributing
See the development documentation
A simple screen parsing tool towards pure vision based GUI agent
OmniParser: Screen Parsing tool for Pure Vision Based GUI Agent
📢 [Project Page] [V2 Blog Post] [Models V2] [Models V1.5] [HuggingFace Space Demo]
OmniParser is a comprehensive method for parsing user interface screenshots into structured and easy-to-understand elements, which significantly enhances the ability of GPT-4V to generate actions that can be accurately grounded in the corresponding regions of the interface.
News
- [2025/2] We release OmniParser V2 checkpoints. Watch Video
- [2025/2] We introduce OmniTool: Control a Windows 11 VM with OmniParser + your vision model of choice. OmniTool supports the following large language models out of the box - OpenAI (4o/o1/o3-mini), DeepSeek (R1), Qwen (2.5VL) or Anthropic Computer Use. Watch Video
- [2025/1] V2 is coming. We achieve new state-of-the-art results of 39.5% on the new grounding benchmark Screen Spot Pro with OmniParser v2 (will be released soon)! Read more details here.
- [2024/11] We release an updated version, OmniParser V1.5 which features 1) more fine grained/small icon detection, 2) prediction of whether each screen element is interactable or not. Examples in the demo.ipynb.
- [2024/10] OmniParser was the #1 trending model on huggingface model hub (starting 10/29/2024).
- [2024/10] Feel free to check out our demo on Hugging Face Space! (stay tuned for OmniParser + Claude Computer Use)
- [2024/10] Both the Interactive Region Detection Model and the Icon functional description model are released! Hugging Face models
- [2024/09] OmniParser achieves the best performance on Windows Agent Arena!
Install
First clone the repo, and then set up the environment:
git clone https://github.com/microsoft/OmniParser.git
cd OmniParser
conda create -n "omni" python==3.12
conda activate omni
pip install -r requirements.txt
Ensure you have the V2 weights downloaded into the weights folder (the caption weights folder must be named icon_caption_florence). If not, download them with:
# download the model checkpoints to local directory OmniParser/weights/
for f in icon_detect/{train_args.yaml,model.pt,model.yaml} icon_caption/{config.json,generation_config.json,model.safetensors}; do
    huggingface-cli download microsoft/OmniParser-v2.0 "$f" --local-dir weights
done
mv weights/icon_caption weights/icon_caption_florence
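If you prefer to run the same download from Python, here is a minimal sketch using `huggingface_hub`; the file list simply mirrors the shell loop above, so double-check it against the model repo before relying on it.

```python
# Minimal Python alternative to the CLI download above (a sketch, not the official script).
from huggingface_hub import hf_hub_download

files = [
    "icon_detect/train_args.yaml", "icon_detect/model.pt", "icon_detect/model.yaml",
    "icon_caption/config.json", "icon_caption/generation_config.json",
    "icon_caption/model.safetensors",
]
for f in files:
    hf_hub_download(repo_id="microsoft/OmniParser-v2.0", filename=f, local_dir="weights")
# Afterwards, rename weights/icon_caption to weights/icon_caption_florence as shown above.
```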
Examples:
We put together a few simple examples in the demo.ipynb.
Gradio Demo
To run gradio demo, simply run:
python gradio_demo.py
Model Weights License
For the model checkpoints on the Hugging Face model hub, please note that the icon_detect model is under the AGPL license, which it inherits from the original YOLO model, while icon_caption_blip2 and icon_caption_florence are under the MIT license. Please refer to the LICENSE file in the folder of each model: https://huggingface.co/microsoft/OmniParser.
📚 Citation
Our technical report can be found here. If you find our work useful, please consider citing our work:
@misc{lu2024omniparserpurevisionbased,
title={OmniParser for Pure Vision Based GUI Agent},
author={Yadong Lu and Jianwei Yang and Yelong Shen and Ahmed Awadallah},
year={2024},
eprint={2408.00203},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2408.00203},
}
🚀🚀 「大模型」2小时完全从0训练26M的小参数GPT!🌏 Train a 26M-parameter GPT from scratch in just 2h!
"大道至简"
中文 | English
- 此开源项目旨在完全从0开始,仅用3块钱成本 + 2小时!即可训练出仅为25.8M的超小语言模型MiniMind。
- MiniMind系列极其轻量,最小版本体积是 GPT-3 的 $\frac{1}{7000}$,力求做到最普通的个人GPU也可快速训练。
- 项目同时开源了大模型的极简结构-包含拓展共享混合专家(MoE)、数据集清洗、预训练(Pretrain)、监督微调(SFT)、LoRA微调, 直接偏好强化学习(DPO)算法、模型蒸馏算法等全过程代码。
- MiniMind同时拓展了视觉多模态的VLM: MiniMind-V。
- 项目所有核心算法代码均从0使用PyTorch原生重构!不依赖第三方库提供的抽象接口。
- 这不仅是大语言模型的全阶段开源复现,也是一个入门LLM的教程。
- 希望此项目能为所有人提供一个抛砖引玉的示例,一起感受创造的乐趣!推动更广泛AI社区的进步!
为防止误解,“2小时” 基于NVIDIA 3090硬件设备(单卡)测试,“3块钱” 指GPU服务器租用成本,具体规格详情见下文。
📌 Introduction
大语言模型(Large Language Model, LLM)的出现引发了全世界对AI的空前关注。 无论是ChatGPT、DeepSeek还是Qwen,都以其惊艳的效果令人叹为观止。 然而,动辄数百亿参数的庞大规模,使得它们对个人设备而言不仅难以训练,甚至连部署都显得遥不可及。 打开大模型的“黑盒子”,探索其内部运作机制,多么令人心潮澎湃! 遗憾的是,99%的探索只能止步于使用LoRA等技术对现有大模型进行少量微调,学习一些新指令或任务。 这就好比教牛顿如何使用21世纪的智能手机——虽然有趣,却完全偏离了理解物理本质的初衷。 与此同时,第三方的大模型框架和工具库,如transformers+trl,几乎只暴露了高度抽象的接口。 通过短短10行代码,就能完成“加载模型+加载数据集+推理+强化学习”的全流程训练。 这种高效的封装固然便利,但也像一架高速飞船,将我们与底层实现隔离开来,阻碍了深入探究LLM核心代码的机会。 然而,“用乐高拼出一架飞机,远比坐在头等舱里飞行更让人兴奋!”。 更糟糕的是,互联网上充斥着大量付费课程和营销号,以漏洞百出、一知半解的内容推销AI教程。 正因如此,本项目初衷是拉低LLM的学习门槛,让每个人都能从理解每一行代码开始, 从零开始亲手训练一个极小的语言模型。是的,从零开始训练,而不是仅仅进行推理! 最低只需3块钱不到的服务器成本,就能亲身体验从0到1构建一个语言模型的全过程。 一起感受创造的乐趣吧!
[!NOTE] (截至2025-02-07)MiniMind系列已完成多个型号模型的预训练,最小仅需25.8M(0.02B),即可具备流畅对话能力!
Models List
模型 (大小) | 推理占用 (约) | Release |
---|---|---|
MiniMind2-small (26M) | 0.5 GB | 2025.02.06 |
MiniMind2-MoE (145M) | 1.0 GB | 2025.02.06 |
MiniMind2 (104M) | 1.0 GB | 2025.02.06 |
minimind-v1-small (26M) | 0.5 GB | 2024.08.28 |
minimind-v1-moe (4×26M) | 1.0 GB | 2024.09.17 |
minimind-v1 (108M) | 1.0 GB | 2024.09.01 |
项目包含
- MiniMind-LLM结构的全部代码(Dense+MoE模型)。
- 包含Tokenizer分词器详细训练代码。
- 包含Pretrain、SFT、LoRA、RLHF-DPO、模型蒸馏的全过程训练代码。
- 收集、蒸馏、整理并清洗去重所有阶段的高质量数据集,且全部开源。
- 从0实现预训练、指令微调、LoRA、DPO强化学习,白盒模型蒸馏。关键算法几乎不依赖第三方封装的框架,且全部开源。
- 同时兼容 `transformers`、`trl`、`peft` 等第三方主流框架。
- 训练支持单机单卡、单机多卡(DDP、DeepSpeed)训练,支持wandb可视化训练流程。支持动态启停训练。
- 在第三方测评榜(C-Eval、C-MMLU、OpenBookQA等)进行模型测试。
- 实现Openai-Api协议的极简服务端,便于集成到第三方ChatUI使用(FastGPT、Open-WebUI等)。
- 基于streamlit实现最简聊天WebUI前端。
- 复现(蒸馏/RL)大型推理模型DeepSeek-R1的MiniMind-Reason模型,数据+模型全部开源!
希望此开源项目可以帮助LLM初学者快速入门!
👉更新日志
2025-02-09 (newest 🎉🎉🎉)
- 迎来发布以来重大更新,Release MiniMind2 Series。
- 代码几乎全部重构,使用更简洁明了的统一结构。 如有旧代码的兼容性需要,可访问🔗旧仓库内容🔗。
- 免去数据预处理步骤。统一数据集格式,更换为 `jsonl` 格式杜绝数据集下载混乱的问题。
- MiniMind2系列效果相比MiniMind-V1显著提升。
- 小问题:{kv-cache写法更标准、MoE的负载均衡loss被考虑等等}
- 提供模型迁移到私有数据集的训练方案(医疗模型、自我认知样例)。
- 精简预训练数据集,并大幅提升预训练数据质量,大幅缩短个人快速训练所需时间,单卡3090即可2小时复现!
- 更新:LoRA微调脱离peft包装,从0实现LoRA过程;DPO算法从0使用PyTorch原生实现;模型白盒蒸馏原生实现。
- MiniMind2-DeepSeek-R1系列蒸馏模型诞生!
- MiniMind2具备一定的英文能力!
- 更新MiniMind2与第三方模型的基于更多大模型榜单测试性能的结果。
2024-10-05
- 为MiniMind拓展了多模态能力之---视觉
- 移步孪生项目minimind-v查看详情!
2024-09-27
- 09-27更新pretrain数据集的预处理方式,为了保证文本完整性,放弃预处理成.bin训练的形式(轻微牺牲训练速度)。
- 目前pretrain预处理后的文件命名为:pretrain_data.csv。
- 删除了一些冗余的代码。
2024-09-17
- 更新minimind-v1-moe模型
- 为了防止歧义,不再使用mistral_tokenizer分词,全部采用自定义的minimind_tokenizer作为分词器。
2024-09-01
- 更新minimind-v1 (108M)模型,采用minimind_tokenizer,预训练轮次3 + SFT轮次10,更充分训练,性能更强。
- 项目已部署至ModelScope创空间,可以在此网站上体验:
- 🔗ModelScope在线体验🔗
2024-08-27
- 项目首次开源
📌 快速开始
分享本人的软硬件配置(仅供参考)
- CPU: Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz
- RAM: 128 GB
- GPU: NVIDIA GeForce RTX 3090(24GB) * 8
- Ubuntu==20.04
- CUDA==12.2
- Python==3.10.16
- requirements.txt
第0步
git clone https://github.com/jingyaogong/minimind.git
Ⅰ 测试已有模型效果
1.环境准备
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
2.下载模型
git clone https://huggingface.co/jingyaogong/MiniMind2
3.命令行问答
# load=0: load from pytorch model, load=1: load from transformers-hf model
python eval_model.py --load 1 --model_mode 2
4.或启动WebUI
# 可能需要`python>=3.10` 安装 `pip install streamlit`
# cd scripts
streamlit run web_demo.py
Ⅱ 从0开始自己训练
1.环境准备
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
注:提前测试Torch是否可用cuda
import torch
print(torch.cuda.is_available())
如果不可用,请自行去torch_stable 下载whl文件安装。参考链接
2.数据下载
从下文提供的数据集下载链接下载需要的数据文件(创建 `./dataset` 目录)并放到 `./dataset` 下。
注:数据集须知
默认推荐下载 `pretrain_hq.jsonl` + `sft_mini_512.jsonl`,最快速度复现Zero聊天模型。
数据文件可自由选择,下文提供了多种搭配方案,可根据自己手头的训练需求和GPU资源进行适当组合。
3.开始训练
3.1 预训练(学知识)
python train_pretrain.py
执行预训练,得到 `pretrain_*.pth` 作为预训练的输出权重(其中*为模型的dimension,默认为512)
3.2 监督微调(学对话方式)
python train_full_sft.py
执行监督微调,得到 `full_sft_*.pth` 作为指令微调的输出权重(其中 `full` 即为全参数微调)
注:训练须知
所有训练过程默认每隔100步保存1次参数到文件 `./out/***.pth`(每次会覆盖掉旧权重文件)。
简单起见,此处只写明两个阶段训练过程。如需其它训练 (LoRA, 蒸馏, 强化学习, 微调推理等) 可参考下文【实验】小节的详细说明。
4.测试模型效果
确保需要测试的模型 `*.pth` 文件位于 `./out/` 目录下。也可以直接去此处下载使用我训练的 `*.pth` 文件。
python eval_model.py --model_mode 1 # 默认为0:测试pretrain模型效果,设置为1:测试full_sft模型效果
注:测试须知
如需详情,查看 `eval_model.py` 脚本代码即可。model_mode分为 0: 预训练模型,1: SFT-Chat模型,2: RLHF-Chat模型,3: Reason模型
[!TIP] 所有训练脚本均为Pytorch原生框架,均支持多卡加速,假设你的设备有N (N>1) 张显卡:
单机N卡启动训练方式 (DDP, 支持多机多卡集群)
torchrun --nproc_per_node N train_xxx.py
注:其它须知
单机N卡启动训练 (DeepSpeed)
deepspeed --master_port 29500 --num_gpus=N train_xxx.py
可根据需要开启wandb记录训练过程
# 需要登录: wandb login
torchrun --nproc_per_node N train_xxx.py --use_wandb
# and
python train_xxx.py --use_wandb
通过添加 `--use_wandb` 参数,可以记录训练过程,训练完成后,可以在wandb网站上查看训练过程。通过修改 `wandb_project` 和 `wandb_run_name` 参数,可以指定项目名称和运行名称。
📌 数据介绍
Ⅰ Tokenizer
分词器将单词从自然语言通过“词典”映射到 `0, 1, 36` 这样的数字,可以理解为数字就代表了单词在“词典”中的页码。可以选择自己构造词表训练一个“词典”,代码可见 `./scripts/train_tokenizer.py`(仅供学习参考,若非必要无需再自行训练,MiniMind已自带tokenizer)。或者选择比较出名的开源大模型分词器,正如同直接用新华/牛津词典的优点是token编码压缩率很好,缺点是页数太多,动辄数十万个词汇短语;自己训练的分词器,优点是词表长度和内容随意控制,缺点是压缩率很低(例如"hello"也许会被拆分为"h e l l o" 五个独立的token),且生僻词难以覆盖。“词典”的选择固然很重要,LLM的输出本质上是SoftMax到词典N个词的多分类问题,然后通过“词典”解码到自然语言。因为MiniMind体积需要严格控制,为了避免模型头重脚轻(词嵌入embedding层参数在LLM占比太高),所以词表长度短短益善。
Tokenizer介绍
第三方强大的开源模型例如Yi、qwen、chatglm、mistral、Llama3的tokenizer词表长度如下:
Tokenizer模型 | 词表大小 | 来源 |
---|---|---|
yi tokenizer | 64,000 | 01万物(中国) |
qwen2 tokenizer | 151,643 | 阿里云(中国) |
glm tokenizer | 151,329 | 智谱AI(中国) |
mistral tokenizer | 32,000 | Mistral AI(法国) |
llama3 tokenizer | 128,000 | Meta(美国) |
minimind tokenizer | 6,400 | 自定义 |
👉2024-09-17更新:为了防止过去的版本歧义&控制体积,minimind所有模型均使用minimind_tokenizer分词,废弃所有mistral_tokenizer版本。
# 一些自言自语
> 尽管minimind_tokenizer长度很小,编解码效率弱于qwen2、glm等中文友好型分词器。
> 但minimind模型选择了自己训练的minimind_tokenizer作为分词器,以保持整体参数轻量,避免编码层和计算层占比失衡,头重脚轻,因为minimind的词表大小只有6400。
> 且minimind在实际测试中没有出现过生僻词汇解码失败的情况,效果良好。
> 由于自定义词表压缩长度到6400,使得LLM总参数量最低只有25.8M。
> 训练数据`tokenizer_train.jsonl`均来自于`匠数大模型数据集`,这部分数据相对次要,如需训练可以自由选择。
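下面是一个用 `tokenizers` 库训练 6,400 词表 BPE 分词器的最小示例,仅供参考:词表大小取自上文,但语料字段名、special tokens 等均为演示假设,并非 `./scripts/train_tokenizer.py` 的实际实现。

```python
# 演示:训练一个 6,400 词表的 BPE 分词器(示意代码,具体设置均为假设)
import json
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

def texts(path="./dataset/tokenizer_train.jsonl"):
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield json.loads(line)["text"]   # 假设每行是 {"text": "..."} 格式

tokenizer = Tokenizer(models.BPE(unk_token="<unk>"))
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)
trainer = trainers.BpeTrainer(vocab_size=6400, special_tokens=["<unk>", "<s>", "</s>"])
tokenizer.train_from_iterator(texts(), trainer)
tokenizer.save("./minimind_tokenizer.json")
```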
Ⅱ Pretrain数据
经历了MiniMind-V1的低质量预训练数据,导致模型胡言乱语的教训,2025-02-05 之后决定不再采用大规模无监督的数据集做预训练。进而尝试把匠数大模型数据集的中文部分提取出来,清洗出字符 `<512` 长度的大约1.6GB的语料直接拼接成预训练数据 `pretrain_hq.jsonl`,hq即为high quality(当然也还不算high,提升数据质量无止尽)。
文件 `pretrain_hq.jsonl` 数据格式为
{"text": "如何才能摆脱拖延症? 治愈拖延症并不容易,但以下建议可能有所帮助..."}
Ⅲ SFT数据
匠数大模型SFT数据集 “是一个完整、格式统一、安全的大模型训练和研究资源。 从网络上的公开数据源收集并整理了大量开源数据集,对其进行了格式统一,数据清洗, 包含10M条数据的中文数据集和包含2M条数据的英文数据集。” 以上是官方介绍,下载文件后的数据总量大约在4B tokens,肯定是适合作为中文大语言模型的SFT数据的。 但是官方提供的数据格式很乱,全部用来sft代价太大。 我将把官方数据集进行了二次清洗,把含有符号污染和噪声的条目去除;另外依然只保留了总长度<512
的内容,此阶段希望通过大量对话补充预训练阶段欠缺的知识。 导出文件为sft_512.jsonl
(~7.5GB)。
Magpie-SFT数据集 收集了~1M条来自Qwen2/2.5的高质量对话,我将这部分数据进一步清洗,把总长度<2048
的部分导出为sft_2048.jsonl
(~9GB)。 长度<1024
的部分导出为sft_1024.jsonl
(~5.5GB),用大模型对话数据直接进行sft就属于“黑盒蒸馏”的范畴。
进一步清洗前两步sft的数据(只保留中文字符占比高的内容),筛选长度<512
的对话,得到sft_mini_512.jsonl
(~1.2GB)。
所有sft文件 sft_X.jsonl
数据格式均为
{
"conversations": [
{"role": "user", "content": "你好"},
{"role": "assistant", "content": "你好!"},
{"role": "user", "content": "再见"},
{"role": "assistant", "content": "再见!"}
]
}
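把上述对话格式拼接成一条训练文本的最小示例如下,仅作演示:其中的模板标记为假设写法,并非 MiniMind 实际使用的 chat template。

```python
# 演示:把 sft_X.jsonl 的 conversations 拼接为单条文本(模板标记为假设)
import json

def render(conversations):
    parts = [f"<|{turn['role']}|>\n{turn['content']}" for turn in conversations]
    return "\n".join(parts) + "\n"

with open("./dataset/sft_mini_512.jsonl", encoding="utf-8") as f:
    sample = json.loads(next(f))
    print(render(sample["conversations"]))
```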
Ⅳ RLHF数据
来自Magpie-DPO数据集 大约200k条偏好数据(均是英文)生成自Llama3.1-70B/8B,可以用于训练奖励模型,优化模型回复质量,使其更加符合人类偏好。这里将数据总长度 `<3000` 的内容重组为 `dpo.jsonl`(~0.9GB),包含 `chosen` 和 `rejected` 两个字段,`chosen` 为偏好的回复,`rejected` 为拒绝的回复。
文件 `dpo.jsonl` 数据格式为
{
"chosen": [
{"content": "Q", "role": "user"},
{"content": "good answer", "role": "assistant"}
],
"rejected": [
{"content": "Q", "role": "user"},
{"content": "bad answer", "role": "assistant"}
]
}
Ⅴ Reason数据集:
不得不说2025年2月谁能火的过DeepSeek... 也激发了我对RL引导的推理模型的浓厚兴趣,目前已经用Qwen2.5复现了R1-Zero。 如果有时间+效果work(但99%基模能力不足)我会在之后更新MiniMind基于RL训练的推理模型而不是蒸馏模型。 时间有限,最快的低成本方案依然是直接蒸馏(黑盒方式)。 耐不住R1太火,短短几天就已经存在一些R1的蒸馏数据集R1-Llama-70B、R1-Distill-SFT、 Alpaca-Distill-R1、 deepseek_r1_zh等等,纯中文的数据可能比较少。 最终整合它们,导出文件为r1_mix_1024.jsonl
,数据格式和sft_X.jsonl
一致。
Ⅵ 更多数据集
目前已经有HqWu-HITCS/Awesome-Chinese-LLM 在收集和梳理中文LLM相关的开源模型、应用、数据集及教程等资料,并持续更新这方面的最新进展。全面且专业,Respect!
Ⅷ 数据集下载
[!NOTE] 2025-02-05后,开源MiniMind最终训练所用的所有数据集,因此无需再自行预处理大规模数据集,避免重复性的数据处理工作。
MiniMind训练数据集 (ModelScope | HuggingFace)
无需全部clone,可单独下载所需的文件
将下载的数据集文件放到./dataset/
目录下(✨为推荐的必须项)
./dataset/
├── dpo.jsonl (909MB)
├── lora_identity.jsonl (22.8KB)
├── lora_medical.jsonl (34MB)
├── pretrain_hq.jsonl (1.6GB, ✨)
├── r1_mix_1024.jsonl (340MB)
├── sft_1024.jsonl (5.6GB)
├── sft_2048.jsonl (9GB)
├── sft_512.jsonl (7.5GB)
├── sft_mini_512.jsonl (1.2GB, ✨)
└── tokenizer_train.jsonl (1GB)
注:各数据集简介
- `dpo.jsonl` --RLHF阶段数据集
- `lora_identity.jsonl` --自我认知数据集(例如:你是谁?我是minimind...),推荐用于lora训练(亦可用于全参SFT,勿被名字局限)
- `lora_medical.jsonl` --医疗问答数据集,推荐用于lora训练(亦可用于全参SFT,勿被名字局限)
- `pretrain_hq.jsonl` ✨ --预训练数据集,整合自jiangshu科技
- `r1_mix_1024.jsonl` --DeepSeek-R1-1.5B蒸馏数据,每条数据字符最大长度为1024(因此训练时设置max_seq_len=1024)
- `sft_1024.jsonl` --整合自Qwen2.5蒸馏数据(是sft_2048的子集),每条数据字符最大长度为1024(因此训练时设置max_seq_len=1024)
- `sft_2048.jsonl` --整合自Qwen2.5蒸馏数据,每条数据字符最大长度为2048(因此训练时设置max_seq_len=2048)
- `sft_512.jsonl` --整合自匠数科技SFT数据,每条数据字符最大长度为512(因此训练时设置max_seq_len=512)
- `sft_mini_512.jsonl` ✨ --极简整合自匠数科技SFT数据+Qwen2.5蒸馏数据(用于快速训练Zero模型),每条数据字符最大长度为512(因此训练时设置max_seq_len=512)
- `tokenizer_train.jsonl` --均来自于匠数大模型数据集,这部分数据相对次要,(不推荐自己重复训练tokenizer,理由如上)如需自己训练tokenizer可以自由选择数据集。
说明 & 推荐训练方案
- MiniMind2 Series均经过共约20GB语料训练,大约4B tokens,即对应上面的数据组合训练结果(开销:💰💰💰💰💰💰💰💰,效果:😊😊😊😊😊😊)
- 想要最快速度从0实现Zero模型,推荐使用 `pretrain_hq.jsonl` + `sft_mini_512.jsonl` 的数据组合,具体花销和效果可查看下文表格(开销:💰,效果:😊😊)
- 推荐具备一定算力资源或更在意效果的朋友可以考虑前者完整复现MiniMind2;仅有单卡GPU或在乎短时间快速复现的朋友强烈推荐后者;
- 【折中方案】亦可选择例如 `sft_mini_512.jsonl`、`sft_1024.jsonl` 中等规模数据进行自由组合训练(开销:💰💰💰,效果:😊😊😊😊)。
📌 Model Structure
MiniMind-Dense(和Llama3.1一样)使用了Transformer的Decoder-Only结构,跟GPT-3的区别在于:
- 采用了GPT-3的预标准化方法,也就是在每个Transformer子层的输入上进行归一化,而不是在输出上。具体来说,使用的是RMSNorm归一化函数。
- 用SwiGLU激活函数替代了ReLU,这样做是为了提高性能。
- 像GPT-Neo一样,去掉了绝对位置嵌入,改用了旋转位置嵌入(RoPE),这样在处理超出训练长度的推理时效果更好。
MiniMind-MoE模型,它的结构基于Llama3和Deepseek-V2/3中的MixFFN混合专家模块。
- DeepSeek-V2在前馈网络(FFN)方面,采用了更细粒度的专家分割和共享的专家隔离技术,以提高Experts的效果。
MiniMind的整体结构一致,只是在RoPE计算、推理函数和FFN层的代码上做了一些小调整。 其结构如下图(重绘版):
修改模型配置见./model/LMConfig.py。 参考模型参数版本见下表:
Model Name | params | len_vocab | rope_theta | n_layers | d_model | kv_heads | q_heads | share+route |
---|---|---|---|---|---|---|---|---|
MiniMind2-Small | 26M | 6400 | 1e6 | 8 | 512 | 2 | 8 | - |
MiniMind2-MoE | 145M | 6400 | 1e6 | 8 | 640 | 2 | 8 | 1+4 |
MiniMind2 | 104M | 6400 | 1e6 | 16 | 768 | 2 | 8 | - |
minimind-v1-small | 26M | 6400 | 1e4 | 8 | 512 | 8 | 16 | - |
minimind-v1-moe | 4×26M | 6400 | 1e4 | 8 | 512 | 8 | 16 | 1+4 |
minimind-v1 | 108M | 6400 | 1e4 | 16 | 768 | 8 | 16 | - |
📌 Experiment
Ⅰ 训练开销
- 时间单位:小时 (h)。
- 成本单位:人民币 (¥);7¥ ≈ 1美元。
- 3090 租卡单价:≈1.3¥/h(可自行参考实时市价)。
- 参考标准:表格仅实测 `pretrain` 和 `sft_mini_512` 两个数据集的训练时间,其它耗时根据数据集大小估算(可能存在些许出入)。
基于 3090 (单卡)成本计算
Model Name | params | pretrain | sft_mini_512 | sft_512 | sft_1024 | sft_2048 | RLHF |
---|---|---|---|---|---|---|---|
MiniMind2-Small | 26M | ≈1.1h ≈1.43¥ | ≈1h ≈1.3¥ | ≈6h ≈7.8¥ | ≈4.58h ≈5.95¥ | ≈7.5h ≈9.75¥ | ≈1h ≈1.3¥ |
MiniMind2 | 104M | ≈3.9h ≈5.07¥ | ≈3.3h ≈4.29¥ | ≈20h ≈26¥ | ≈15h ≈19.5¥ | ≈25h ≈32.5¥ | ≈3h ≈3.9¥ |
训练开销总结&预测
- MiniMind2-Small参数:`pretrain_hq` + `sft_mini_512` 数据集,单卡3090 (1 epoch) + 2.1小时 + 花费2.73元人民币,即可从0训练出MiniMind-Zero-0.025B模型!!!
- MiniMind2-Small参数:`pretrain_hq` + `sft_512` + `sft_2048` + `dpo` 数据集,单卡3090 (2 epochs) + 大约38.16小时 + 花费49.61元人民币,即可从0训练出MiniMind2-Small-0.025B模型!!!
- MiniMind2参数:`pretrain_hq` + `sft_512` + `sft_2048` + `dpo` 数据集,单卡3090 (2 epochs) + 大约122小时 + 花费158.6元人民币,即可从0训练出MiniMind2-0.1B模型!!!

✨基于单卡NVIDIA 3090的MiniMind-Zero,从0训练仅需 `2小时` + `3块钱`,实现ChatBot效果!
✨PS:若采用8卡4090训练,总用时甚至可以压缩到10分钟以内!(由于时间更短,花费同样3元左右,与单卡成本相当)
✨以极低极低的门槛,实现人人可玩的大模型自由!这正是MiniMind系列的诞生初衷!
✨仅价值 `3块钱` 成本的 MiniMind-Zero 并不是噱头!Chat测试:
👶: 请介绍一下自己。
🤖️: 作为人工智能,我没有实际的生活,也没有自我意识,所以没有自己的生活。我被设计成能够帮助用户解答问题、提供信息、进行对话等。我的设计和功能是由计算机科学、机器学习、人工智能技术和算法所构建的,我通过编程和算法实现。
👶: 推荐一些杭州的特色美食吧。
🤖️: 杭州有很多美味的特色美食,比如鸡肉串、鳗鱼头、水煮鱼、豆腐脑等。这些美食都非常美味,有着独特的风味和口感,绝对是值得一试的美食。
👶: 请为我讲解“大语言模型”这个概念。
🤖️: 大语言模型是一种人工智能技术,它使用大量的文本数据来训练模型,然后在不断的迭代中不断优化模型。它的目标是根据输入的数据来生成符合特定需求的文本,这些文本可以是任何你感兴趣的主题。
👶: Introduce the history of the United States, please.
🤖️: 您提到的“Introok's the believeations of theument." 这个名字来源于中国古代的"groty of of the change."
极速且初具效果,甚至仍然可以进一步压缩获取更小更优质的训练数据。Zero模型权重保存为 `full_sft_512_zero.pth`(见下文MiniMind模型文件链接),如有兴趣可下载检验此模型效果。
Ⅱ 主要训练步骤
1. 预训练(Pretrain):
LLM首先要学习的并非直接与人交流,而是让网络参数中充满知识的墨水,“墨水” 理论上喝的越饱越好,产生大量的对世界的知识积累。 预训练就是让Model先埋头苦学大量基本的知识,例如从Wiki百科、新闻、书籍整理大规模的高质量训练数据。 这个过程是“无监督”的,即人类不需要在过程中做任何“有监督”的校正,而是由模型自己从大量文本中总结规律学习知识点。 模型此阶段目的只有一个:学会词语接龙。例如我们输入“秦始皇”四个字,它可以接龙“是中国的第一位皇帝”。
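“词语接龙”对应的训练目标就是错一位的交叉熵(next-token prediction),下面用几行示意代码表达这一点(张量形状仅作演示,并非 `train_pretrain.py` 的原始实现):

```python
# 演示:next-token prediction 的交叉熵损失(示意代码)
import torch.nn.functional as F

def next_token_loss(logits, input_ids):
    """logits: (batch, seq, vocab); input_ids: (batch, seq)"""
    shift_logits = logits[:, :-1, :]      # 第 t 个位置预测第 t+1 个 token
    shift_labels = input_ids[:, 1:]       # 真实的下一个 token
    return F.cross_entropy(shift_logits.reshape(-1, shift_logits.size(-1)),
                           shift_labels.reshape(-1))
```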
torchrun --nproc_per_node 1 train_pretrain.py # 1即为单卡训练,可根据硬件情况自行调整 (设置>=2)
# or
python train_pretrain.py
训练后的模型权重文件默认每隔 `100步` 保存为:`pretrain_*.pth`(* 为模型具体dimension,每次保存时新文件会覆盖旧文件)
2. 有监督微调(Supervised Fine-Tuning):
经过预训练,LLM此时已经掌握了大量知识,然而此时它只会无脑地词语接龙,还不会与人聊天。 SFT阶段就需要把半成品LLM施加一个自定义的聊天模板进行微调。 例如模型遇到这样的模板【问题->回答,问题->回答】后不再无脑接龙,而是意识到这是一段完整的对话结束。 称这个过程为指令微调,就如同让已经学富五车的「牛顿」先生适应21世纪智能手机的聊天习惯,学习屏幕左侧是对方消息,右侧是本人消息这个规律。 在训练时,MiniMind的指令和回答长度被截断在512,是为了节省显存空间。就像我们学习时,会先从短的文章开始,当学会写作200字作文后,800字文章也可以手到擒来。 在需要长度拓展时,只需要准备少量的2k/4k/8k长度对话数据进行进一步微调即可(此时最好配合RoPE-NTK的基准差值)。
在推理时通过调整RoPE线性差值,实现免训练长度外推到2048及以上将会很方便。
torchrun --nproc_per_node 1 train_full_sft.py
# or
python train_full_sft.py
训练后的模型权重文件默认每隔 `100步` 保存为:`full_sft_*.pth`(* 为模型具体dimension,每次保存时新文件会覆盖旧文件)
Ⅲ 其它训练步骤
3. 人类反馈强化学习(Reinforcement Learning from Human Feedback, RLHF)
在前面的训练步骤中,模型已经具备了基本的对话能力,但是这样的能力完全基于单词接龙,缺少正反样例的激励。 模型此时尚未知什么回答是好的,什么是差的。我们希望它能够更符合人的偏好,降低让人类不满意答案的产生概率。 这个过程就像是让模型参加新的培训,从优秀员工的作为例子,消极员工作为反例,学习如何更好地回复。 此处使用的是RLHF系列之-直接偏好优化(Direct Preference Optimization, DPO)。 与PPO(Proximal Policy Optimization)这种需要奖励模型、价值模型的RL算法不同; DPO通过推导PPO奖励模型的显式解,把在线奖励模型换成离线数据,Ref模型输出可以提前保存。 DPO性能几乎不变,只用跑 actor_model 和 ref_model 两个模型,大大节省显存开销和增加训练稳定性。
注:RLHF训练步骤并非必须,此步骤难以提升模型“智力”而通常仅用于提升模型的“礼貌”,有利(符合偏好、减少有害内容)也有弊(样本收集昂贵、反馈偏差、多样性损失)。
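作为参考,上文描述的 DPO 目标可以写成如下最小示意(β 取值与对数概率的计算方式均为演示假设,并非 `train_dpo.py` 的原始实现):

```python
# 演示:DPO 损失 = -log sigmoid(beta * (策略偏好差 - 参考模型偏好差))
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """输入均为每条样本的序列对数概率,形状 (batch,)"""
    pi_logratios = policy_chosen_logps - policy_rejected_logps
    ref_logratios = ref_chosen_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (pi_logratios - ref_logratios)).mean()
```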
torchrun --nproc_per_node 1 train_dpo.py
# or
python train_dpo.py
训练后的模型权重文件默认每隔 `100步` 保存为:`rlhf_*.pth`(* 为模型具体dimension,每次保存时新文件会覆盖旧文件)
4. 知识蒸馏(Knowledge Distillation, KD)
在前面的所有训练步骤中,模型已经完全具备了基本能力,通常可以学成出师了。 而知识蒸馏可以进一步优化模型的性能和效率,所谓知识蒸馏,即学生模型面向教师模型学习。 教师模型通常是经过充分训练的大模型,具有较高的准确性和泛化能力。 学生模型是一个较小的模型,目标是学习教师模型的行为,而不是直接从原始数据中学习。 在SFT学习中,模型的目标是拟合词Token分类硬标签(hard labels),即真实的类别标签(如 0 或 6400)。 在知识蒸馏中,教师模型的softmax概率分布被用作软标签(soft labels)。小模型仅学习软标签,并使用KL-Loss来优化模型的参数。 通俗地说,SFT直接学习老师给的解题答案。而KD过程相当于“打开”老师聪明的大脑,尽可能地模仿老师“大脑”思考问题的神经元状态。 例如,当老师模型计算1+1=2
这个问题的时候,最后一层神经元a状态为0,神经元b状态为100,神经元c状态为-99... 学生模型通过大量数据,学习教师模型大脑内部的运转规律。这个过程即称之为:知识蒸馏。 知识蒸馏的目的只有一个:让小模型体积更小的同时效果更好。 然而随着LLM诞生和发展,模型蒸馏一词被广泛滥用,从而产生了“白盒/黑盒”知识蒸馏两个派别。 GPT-4这种闭源模型,由于无法获取其内部结构,因此只能面向它所输出的数据学习,这个过程称之为黑盒蒸馏,也是大模型时代最普遍的做法。 黑盒蒸馏与SFT过程完全一致,只不过数据是从大模型的输出收集,因此只需要准备数据并且进一步FT即可。 注意更改被加载的基础模型为full_sft_*.pth
,即基于微调模型做进一步的蒸馏学习。 ./dataset/sft_1024.jsonl
与./dataset/sft_2048.jsonl
均收集自qwen2.5-7/72B-Instruct大模型,可直接用于SFT以获取Qwen的部分行为。
# 注意需要更改train_full_sft.py数据集路径,以及max_seq_len
torchrun --nproc_per_node 1 train_full_sft.py
# or
python train_full_sft.py
训练后的模型权重文件默认每隔 `100步` 同样保存为:`full_sft_*.pth`(*为模型具体dimension,每次保存时新文件会覆盖旧文件)
此处应当着重介绍MiniMind实现的白盒蒸馏代码 `train_distillation.py`,由于MiniMind同系列本身并不存在强大的教师模型,因此白盒蒸馏代码仅作为学习参考。
torchrun --nproc_per_node 1 train_distillation.py
# or
python train_distillation.py
5. LoRA (Low-Rank Adaptation)
LoRA是一种参数高效微调(Parameter-Efficient Fine-Tuning, PEFT)方法,旨在通过低秩分解的方式对预训练模型进行微调。相比于全参数微调(Full Fine-Tuning),LoRA 只需要更新少量的参数。LoRA 的核心思想是:在模型的权重矩阵中引入低秩分解,仅对低秩部分进行更新,而保持原始预训练权重不变。代码可见 `./model/model_lora.py` 和 `train_lora.py`,完全从0实现LoRA流程,不依赖第三方库的封装。
torchrun --nproc_per_node 1 train_lora.py
# or
python train_lora.py
训练后的模型权重文件默认每隔 `100步` 保存为:`lora_xxx_*.pth`(* 为模型具体dimension,每次保存时新文件会覆盖旧文件)
非常多的人困惑,如何使模型学会自己私有领域的知识?如何准备数据集?如何迁移通用领域模型打造垂域模型? 这里举几个例子,对于通用模型,医学领域知识欠缺,可以尝试在原有模型基础上加入领域知识,以获得更好的性能。 同时,我们通常不希望学会领域知识的同时损失原有基础模型的其它能力,此时LoRA可以很好的改善这个问题。 只需要准备如下格式的对话数据集放置到 `./dataset/lora_xxx.jsonl`,启动 `python train_lora.py` 训练即可得到 `./out/lora/lora_xxx.pth` 新模型权重。
医疗场景
{"conversations": [{"role": "user", "content": "请问颈椎病的人枕头多高才最好?"}, {"role": "assistant", "content": "颈椎病患者选择枕头的高度应该根据..."}]}
{"conversations": [{"role": "user", "content": "请问xxx"}, {"role": "assistant", "content": "xxx..."}]}
自我认知场景
{"conversations": [{"role": "user", "content": "你叫什么名字?"}, {"role": "assistant", "content": "我叫minimind..."}]}
{"conversations": [{"role": "user", "content": "你是谁"}, {"role": "assistant", "content": "我是..."}]}
此时【基础模型+LoRA模型】即可获得医疗场景模型增强的能力,相当于为基础模型增加了LoRA外挂,这个过程并不损失基础模型的本身能力。我们可以通过 `eval_model.py` 进行模型评估测试。
# 注意:model_mode即选择基础模型的类型,这和train_lora是基于哪个模型训练的相关,确保统一即可。
python eval_model.py --lora_name 'lora_medical' --model_mode 2
小测试
👶: 我最近经常感到头晕,可能是什么原因?
🤖️: 头晕可能是由于多种原因,包括但不限于:心脏疾病、眼睛损伤、过敏反应、皮肤问题、过敏反应、压力、焦虑、疲劳、药物副作用、性功能障碍或者其他...
---
👶: 你和openai是什么关系?
🤖️: 我是 MiniMind,一个由 Jingyao Gong 开发的人工智能助手。我通过自然语言处理和算法训练来与用户进行交互。
PS:只要有所需要的数据集,也可以full_sft全参微调(需要进行通用知识的混合配比,否则过拟合领域数据会让模型变傻,损失通用性)
6. 训练推理模型 (Reasoning Model)
DeepSeek-R1实在太火了,几乎重新指明了未来LLM的新范式。 论文指出>3B
的模型经历多次反复的冷启动和RL奖励训练才能获得肉眼可见的推理能力提升。 最快最稳妥最经济的做法,以及最近爆发的各种各样所谓的推理模型几乎都是直接面向数据进行蒸馏训练, 但由于缺乏技术含量,蒸馏派被RL派瞧不起(hhhh)。 本人迅速已经在Qwen系列1.5B小模型上进行了尝试,很快复现了Zero过程的数学推理能力。 然而一个遗憾的共识是:参数太小的模型直接通过冷启动SFT+GRPO几乎不可能获得任何推理效果。 MiniMind2第一时间只能坚定不移的选择做蒸馏派,日后基于0.1B模型的RL如果同样取得小小进展会更新此部分的训练方案。
做蒸馏需要准备的依然是和SFT阶段同样格式的数据即可,数据集来源已如上文介绍。数据格式例如:
{
"conversations": [
{
"role": "user",
"content": "你好,我是小芳,很高兴认识你。"
},
{
"role": "assistant",
"content": "<think>\n你好!我是由中国的个人开发者独立开发的智能助手MiniMind-R1-Lite-Preview,很高兴为您提供服务!\n</think>\n<answer>\n你好!我是由中国的个人开发者独立开发的智能助手MiniMind-R1-Lite-Preview,很高兴为您提供服务!\n</answer>"
}
]
}
推理模型R1的回复模板是:
<think>\n思考过程\n</think>\n
<answer>\n最终回答\n</answer>
这在GRPO中通过设置规则奖励函数约束模型符合思考标签和回复标签(在冷启动靠前的阶段奖励值设置应该提高一些)
另一个问题是蒸馏过程虽然和SFT一样,但实验结果是模型难以每次都符合模板规范的回复,即脱离思考和回复标签约束。 这里的小技巧是增加标记位置token的损失惩罚,详见 `train_distill_reason.py`:
# 在 sp_ids 对应的位置增加额外的惩罚
...
loss_mask[sp_ids] = 10 # 惩罚系数
另一个tips是:由于推理数据只筛选了 `<1024` 长度的数据,其中多轮对话和英文数据偏少,因此 `r1_mix_1024.jsonl` 混入了大约10k条多轮对话+英文数据,防止模型遗忘严重。
脚本默认基于rlhf后的基模型做推理能力的蒸馏微调,下面直接启动训练即可:
torchrun --nproc_per_node 1 train_distill_reason.py
# or
python train_distill_reason.py
训练后的模型权重文件默认每隔 `100步` 保存为:`reason_*.pth`(*为模型具体dimension,每次保存时新文件会覆盖旧文件)
测试一下:
👶: 请介绍一下自己。
🤖️: <think>
嗯,用户让我介绍一下自己。我现在需要解释一下自己。首先,我应该明确用户是谁,是学生、职场人士还是普通人。用户可能是一个学生或者职场人士,所以信息传递应该是帮助学生和职场人士的信息。
然后,我需要回应用户,这可能包括介绍自己的位置、兴趣爱好、职业经历等信息。用户还可能想了解自己在哪里,比如职业发展、工作经验等。
我应该注意用词要正式,不要太正式,同时要保持简洁明了。这样用户容易理解,也方便他们查阅相关资料。
最后,我应该总结一下,让用户知道我是如何帮助他们完成他们的任务的,这样他们有更多关于我的信息。
</think>
<answer>
我是**“你的名字”,你叫什么名字?
</answer>
👶: 你更擅长哪一个学科?
🤖️: <think>
您好!我是由中国的个人开发者开发的智能助手MiniMind-R1。如您有任何任何问题,我会尽我所能为您提供帮助。
</think>
<answer>
您好!我是由中国的个人开发者开发的智能助手MiniMind-R1。如您有任何任何问题,我会尽我所能为您提供帮助。
</answer>
Ⅳ 模型参数设定
📋关于LLM的参数配置,有一篇很有意思的论文MobileLLM做了详细的研究和实验。 Scaling Law在小模型中有自己独特的规律。 引起Transformer参数成规模变化的参数几乎只取决于 `d_model` 和 `n_layers`。

- `d_model` ↑ + `n_layers` ↓ -> 矮胖子
- `d_model` ↓ + `n_layers` ↑ -> 瘦高个

2020年提出Scaling Law的论文认为,训练数据量、参数量以及训练迭代次数才是决定性能的关键因素,而模型架构的影响几乎可以忽视。 然而似乎这个定律对小模型并不完全适用。 MobileLLM提出架构的深度比宽度更重要,「深而窄」的「瘦长」模型可以学习到比「宽而浅」模型更多的抽象概念。 例如当模型参数固定在125M或者350M时,30~42层的「狭长」模型明显比12层左右的「矮胖」模型有更优越的性能, 在常识推理、问答、阅读理解等8个基准测试上都有类似的趋势。 这其实是非常有趣的发现,因为以往为100M左右量级的小模型设计架构时,几乎没人尝试过叠加超过12层。 这与MiniMind在训练过程中,模型参数量在 `d_model` 和 `n_layers` 之间进行调整实验观察到的效果是一致的。 然而「深而窄」的「窄」也是有维度极限的,当d_model<512时,词嵌入维度坍塌的劣势非常明显, 增加的layers并不能弥补词嵌入在固定q_head带来d_head不足的劣势。 当d_model>1536时,layers的增加似乎比d_model的优先级更高,更能带来具有“性价比”的参数->效果增益。
- 因此MiniMind设定small模型dim=512,n_layers=8来获取的「极小体积<->更好效果」的平衡。
- 设定dim=768,n_layers=16来获取效果的更大收益,更加符合小模型Scaling-Law的变化曲线。
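作为粗略参考,可以用下面的示意脚本估算 decoder-only 模型参数量随 `d_model`、`n_layers` 的变化(公式与常数均为近似假设,未考虑 GQA、归一化层等细节,结果只与上表在量级上一致):

```python
# 演示:粗略估算参数量(单位:百万),仅作量级参考
def estimate_params(d_model: int, n_layers: int, vocab: int = 6400, ffn_mult: float = 8 / 3) -> float:
    attn = 4 * d_model * d_model                     # q/k/v/o 四个投影(未计 GQA 折减)
    ffn = 3 * d_model * int(d_model * ffn_mult)      # SwiGLU:gate/up/down 三个矩阵
    emb = vocab * d_model                            # 词嵌入(假设与输出层共享)
    return (n_layers * (attn + ffn) + emb) / 1e6

print(estimate_params(512, 8))    # ≈28,与 MiniMind2-Small(26M)同一量级
print(estimate_params(768, 16))   # ≈118,与 MiniMind2(104M)同一量级
```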
作为参考,GPT3的参数设定见下表:
Ⅴ 训练结果
MiniMind2 模型训练损失走势(由于数据集在训练后又更新清洗多次,因此Loss仅供参考)
(MiniMind2-Small 与 MiniMind2 的 pretrain(length-512)、sft(length-512)损失曲线图,此处从略。)
训练完成-模型合集
考虑到多人反应百度网盘速度慢,MiniMind2及以后全部使用ModelScope/HuggingFace托管。
① PyTorch原生模型
MiniMind2模型权重 (ModelScope | HuggingFace)
MiniMind-V1模型权重 (百度网盘)
Torch文件命名对照
Model Name | params | pretrain_model | sft_model | rl_model | reason_model | lora_model |
---|---|---|---|---|---|---|
MiniMind2-small | 26M | `pretrain_512.pth` | `full_sft_512.pth` | `rlhf_512.pth` | `reason_512.pth` | `lora_xxx_512.pth` |
MiniMind2-MoE | 145M | `pretrain_640_moe.pth` | `full_sft_640_moe.pth` | `rlhf_640_moe.pth` | - | - |
MiniMind2 | 104M | `pretrain_768.pth` | `full_sft_768.pth` | `rlhf_768.pth` | `reason_768.pth` | `lora_xxx_768.pth` |

Model Name | params | pretrain_model | 单轮对话sft | 多轮对话sft | rl_model |
---|---|---|---|---|---|
minimind-v1-small | 26M | `pretrain_512.pth` | `single_chat/full_sft_512.pth` | `multi_chat/full_sft_512.pth` | `rl_512.pth` |
minimind-v1-moe | 4×26M | `pretrain_512_moe.pth` | `single_chat/full_sft_512_moe.pth` | `multi_chat/full_sft_512_moe.pth` | - |
minimind-v1 | 108M | `pretrain_768.pth` | `single_chat/full_sft_768.pth` | `multi_chat/full_sft_768.pth` | `rl_768.pth` |
② Transformers模型
MiniMind系列 (ModelScope | HuggingFace)
📌 Eval
Ⅰ RLHF对比SFT篇
测试基于 `full_sft` 和 `rlhf` 的 MiniMind2系列 模型对比,测试随机种子均固定。
问答实测
[A0] MiniMind2:full_sft_640
[A1] MiniMind2:rlhf_640
[Q]: 你叫什么名字?
[A0]: 我是一个大型语言模型,可以回答各种问题,提供信息和帮助。我可以回答各种问题,提供信息和帮助,帮助你解决各种问题。
[A1]: 我是一种基于文本的AI模型,我并没有个人经历或情感,所以无法感受和体验情感。我的设计目的是为了帮助用户解决问题、提供信息和进行对话。如果您有任何问题或需要帮助,请随时告诉我!
[Q]: 你更擅长哪一个学科?
[A0]: 作为一个AI模型,我并没有真正的“擅长”这个学科。我被设计成能够提供信息、解答问题、进行对话、执行任务和与用户进行对话。不过,我可以提供信息、解答问题、进行对话、提供建议、处理问题、进行对话、提供建议、学习新知识、编写代码、回答问题、提供建议等多种学科的功能。如果你有具体的学科需求或主题,欢迎提供详细信息。
[A1]: 作为AI,我并没有个人能力或兴趣,但我可以提供一些关于数学、科学、计算机科学、语言学、心理学等学科的知识。如果你对数学感兴趣,我可以提供一些基础知识,例如代数、几何、概率、统计学等。如果你需要更深入的学习,我也可以提供相关的在线课程、书籍、论文等资源。请告诉我你具体需要什么帮助,我会尽力提供帮助。
[Q]: 鲁迅的《狂人日记》是如何批判封建礼教的?
[A0]: 鲁迅的《狂人日记》是一部讲述了鲁迅对人类价值和道德责任的倡导。在这部作品中,鲁迅通过细腻的笔触,探讨了人性、道德、正义、牺牲与责任等深刻的主题。以下是鲁迅在《狂人日记》中如何批判封建礼教的几个方面:1. **社会现实与道德观念**:鲁迅通过讲述鲁迅的故事,探讨了社会的不平等、权力的流动、人性的演变等社会问题。他认为,社会不平等不仅仅是不公正的,更是每个人在社会中的不公正。他呼吁社会各界共同努力,通过教育、宣传、法律等手段,为构建和谐社会做出贡献。2. **道德与责任**:鲁迅通过讲述鲁迅的故事,展现了道德和责任的重要性。他提倡通过道德教育来培养人们的道德品质,同时也强调了道德和责任的重要性。鲁迅通过讲述鲁迅的故事,揭示了社会的不平等和人性的复杂性,体现了对人性的深刻理解。3. **道德与责任**:鲁迅通过讲述鲁迅的故事,强调了道德责任的重要性。他提倡通过自己的行为来体现对社会的尊重和对人性的深刻理解。鲁迅通过讲述鲁迅的故事,强调了道德和责任的重要性。4. **道德与道德**:鲁迅通过讲述鲁迅的故事,揭示了道德和责任的真正意义。鲁迅通过讲述鲁迅的故事,揭示了道德和责任的本质,强调了道德和责任的重要性。5. **个人与社会**:鲁迅通过讲述鲁迅的故事,揭示了个人与社会之间的复杂关系。他呼吁社会各界共同努力,通过教育和宣传,为构建和谐社会做出贡献。鲁迅的《狂人日记》通过其丰富的故事和深刻的道德寓言,传达了关于人性、道德、正义、道德以及个人与社会的复杂关系。这部作品不仅在文学价值上取得了巨大成就,更在思想层面上启发了后来的文学创作,对后世产生了深远的影响。
[A1]: 鲁迅的《狂人日记》是一部描绘战争和自由精神的作品,它批判了封建社会的种种问题和冲突,反映了社会的不平等和人性本质。在这部作品中,鲁迅通过对封建社会的诸多现象进行批判,如“弗雷德里克·约翰逊的“鹰迪伦”、“纳扎特的”、“贝克汉姆”、“乔治·威尔逊”等,对封建社会的种种问题进行了批判,强调了封建社会的不平等、人性本质以及人性本质。鲁迅通过他的“鹰迪伦”、“纳扎特的”、“贝克汉姆”、“乔治·威尔逊”等,对封建社会的种种问题进行了批判。在小说中,鲁迅通过对封建社会的诸多现象进行批判,揭示了封建社会的腐败、人性本质以及人性本质的复杂性。这种批判体现了对封建社会的尊重和对人性的复杂性,同时也反映了现代社会中普遍存在的不平等和人性本质。鲁迅的《狂人日记》不仅是一部关于封建社会的小说,也是一部具有深刻社会意义的作品,对后世的审判和批判起到了关键作用。
[Q]: 我咳嗽已经持续了两周,需要去医院检查吗?
[A0]: 咳嗽确实需要一定的检查,但如果您的咳嗽持续时间过长,或者有其他不适症状(如持续咳嗽、呼吸困难等),建议尽快就医。医生会根据您的症状、体检结果和可能的影像学检查(如X光、MRI或CT扫描)来诊断您的症状。在等待就医期间,您可以尝试以下方法来帮助缓解症状:1. **休息**:确保您有足够的睡眠。2. **保持水分**:脱水可能会加重咳嗽。3. **热敷**:使用热敷袋或热毛巾敷在疼痛的部位,有助于缓解喉咙痛。4. **避免刺激**:避免使用可能刺激喉咙的化学物质,如某些化学物质。5. **保持室内湿润**:确保您的卧室有良好的通风,可以使用加湿器帮助保持空气湿度。6. **保持室内湿度**:使用加湿器可以帮助缓解喉咙干燥和咳嗽。如果症状持续不缓解或有其他不适症状,请立即就医。希望您早日康复!
[A1]: 如果您的咳嗽持续了两周,通常建议您尽快就医,因为这可能会影响您的呼吸和心跳。您可以通过以下步骤来检查咳嗽:1. **休息**:用纸巾或肘部轻轻按压咳嗽的位置。2. **使用药物**:在可能的情况下,使用药物(如抗过敏药、抗过敏药等)来缓解咳嗽。请在医生的指导下使用药物。3. **热敷**:在咳嗽期间,可以使用热敷来缓解疼痛和肿胀。4. **药物治疗**:如果咳嗽持续了两周,可以使用一些消炎药或抗过敏药来缓解症状。请务必在医生的指导下使用。5. **保持健康的生活方式**:保持充足的睡眠、均衡的饮食、适量的运动和避免吸烟和限酒,这些都有助于提高咳嗽的几率。请记住,这些建议不能替代专业医疗意见。如果您的症状持续或加重,强烈建议您尽快就医。
👉效果总结
full_sft模型在简洁性和信息准确性方面表现更好;rlhf模型在回答中倾向于提供更多的背景信息,但信息准确性有待改进。 总的来说RLHF后的模型倾向于学习:说更多有礼貌但无用的废话讨好“对话”本身,而对信息准确性则有轻微损失。 天下没有免费的午餐,还需要继续提升RLHF数据集的质量,也要接受模型能力无法避免的损失(程度有轻重)。 DPO和在线PPO的区别在于reject和chosen都是离线准备的,和minimind模型本身的输出必然存在很大的分布差异。 通俗地说DPO算法使模型观看乒乓球世界冠军的打法「录像」进行RL,而不是像PPO一样请reward模型做「教练」纠正自己的打法进行RL。
Ⅱ 主观样例测评
🏃以下测试于2025-02-09完成,此日期后发布的新模型,无特殊需要时将不加入测试。
[A] MiniMind2 (0.1B)
[B] MiniMind2-MoE (0.15B)
[C] MiniMind2-Small (0.02B)
[D] minimind-v1-small(0.02B)
[E] minimind-v1-moe(0.1B)
[F] minimind-v1(0.1B)
[G] baby-llama2-chinese(0.2B)
[H] chatlm-mini-chinese(0.2B)
[Q]: 你知道长江吗?
[A]: 长江,位于中国东南部,是中国的第一长河,全长6300多公里,流经11个省份,是中国最长的河流,也是中国的“母亲河”。长江的历史可以追溯到公元前2000多年的商朝时期,当时的商朝是商朝的后裔。商朝时期,长江的经济发展达到了高峰,是中国最重要的农业和工业中心之一。长江流域的经济主要以农业为主,长江渔业占据了主要产业,如钓鱼、猎鱼等。长江的水资源也非常丰富,包括河流、湖泊、河流等。长江的自然景观也十分壮观,长江四季景色各异,春季多雨,夏季多雨,秋季则多雨,冬季则寒冷多雪,春季和秋季都有美丽的花朵和果实。长江是中国的经济中心,拥有众多的国家和城市,是世界上重要的交通枢纽和航运中心。同时,长江还是中国的重要水利资源,对于中国的经济发展具有重要意义。
[B]: 是的,我知道长江。长江位于中国的中部,是中国第三长的河流,全长约6300公里,流经江西省、安徽省、江苏省、安徽省、江苏省和安徽省。
[C]: 当然,长江是中国最长的河流,全长约6300公里。它以其丰富的水资源、丰富的水生动植物资源以及独特的地理位置而闻名于世。
[D]: 是的,长江是中国最长的河流,全长超过3300千米。
[E]: 长江是中国最长的河流,总长度约为6300公里。它发源于青藏高原的青海湖,流经青海、四川、甘肃、宁夏、内蒙古、陕西和山西。长江流域是中国最大的农业区之一,也是中国的文化和经济中心。
[F]: 是的,长江是中国最长的河流之一,也是世界第三大河流。长江发源于青藏高原,流经11个省份,最终注入东海。长江流域是中国最大的河流流域之一,拥有着丰富的自然资源和文化遗产,如长江大桥、长江海洋博物馆、长江古城等。
[G]: 同时也知道,中国是世界上最长的城市之一,拥有超过3400万人口。
[H]: 是的,长江是中国最长的河流,也是世界上最长的河流之一。它发源于青藏高原,流经九个省区,最终注入东海。长江流域的气候温和,四季分明,是中国重要的农业和工业基地。长江流域的水产资源丰富,是中国的重要渔业资源之一。
[Q]: 世界上最高的山峰是什么?
[A]: 世界上最高的山峰是珠穆朗玛峰(Mount Everest),位于喜马拉雅山脉,尼泊尔和中国的边界上。它的海拔高度是8,848.86米(29,031.7英尺),是地球上最高的山峰。
[B]: 珠穆朗玛峰(Mount Everest)是世界上最高的山峰,海拔高度达8,848米(29,029英尺)。
[C]: 珠穆朗玛峰是世界上最高的山峰,位于喜马拉雅山脉,海拔8,848.86米(29,031.7英尺)。它位于尼泊尔和中国的边界上,是喜马拉雅山脉的一部分。
[D]: 世界上最高的山峰是珠穆朗玛峰,位于尼泊尔和中国的边境上,海拔8848米。
[E]: 世界上最高的山峰是珠穆朗玛峰,位于喜马拉雅山脉,海拔8,848米。
[F]: 珠穆朗玛峰是世界上最高的山峰,海拔8848米。
[G]: 世界上最高的山峰是珠穆朗玛峰,位于喜马拉雅山脉,海拔8,848.86米(29,031.69英尺)。
[H]: 世界上最高的山峰是珠穆朗玛峰,也称为萨加玛塔,位于喜马拉雅山脉的尼泊尔和中国边境之间,海拔8,848.86米(29,031.69英尺)。它是世界上最著名的登山目标之一,吸引着来自世界各地的登山者前来挑战。珠穆朗玛峰的名字来自藏语,意思是“女神的额峰”,它是由两个不同的山峰组成的,其中一个是位于尼泊尔的珠穆朗玛峰,另一个则位于中国。它的高度是通过测量海拔高度来确定的,因为它是通过测量地球的平均高度来确定的。尽管珠穆朗玛峰是世界上最高的山峰,但它并不是地球上任何其他地方都能到达的。地球上的其他地方,如南美洲的安第斯山脉,也是世界上最高的山脉,但它并不是地球上最高的山峰。总之,珠穆朗玛峰是世界上最高的山峰,但它并不是地球上任何地方都能到达的。它是地球上最高的山脉,但它并不是地球上任何地方都能到达的。
[Q]: 万有引力是谁提出的?
[A]: 万有引力是由牛顿在17世纪提出的,它是牛顿三大定律之一。牛顿在17世纪提出的引力定律揭示了万有引力定律的本质,并且解释了行星、卫星、彗星等天体之间如何相互作用。这个定律是由艾萨克·牛顿在1687年提出的,他通过对万有引力定律的分析和实验,成功地解释了天体之间的引力作用,从而解释了地球绕太阳旋转的现象。
[B]: 万有引力(Human Policy)是牛顿的三大力学定律,用于描述两个物体之间的相互作用力。它由万有引力定律和牛顿的第一定律(惯性定律)构成。万有引力是由于两个物体之间相互作用而产生的,它们的质量、距离、速度和方向等因素决定了它们之间的相互作用。这种相互作用力是:质量的乘积:质量的乘积是万有引力定律和牛顿第二定律的乘积。质量的乘积:质量的乘积是物体之间的一种物理过程,它是物体间相互作用的产物,对于其他物体来说,其质量可以表示为两个物体之间的质量之和,即重力加速度a。
[C]: 万有引力是一个由牛顿在1687年提出的物理学概念,它描述了物体之间引力如何相互作用的规律。
[D]: 万有引力是由英国物理学家牛顿提出的。他认为,所有物体都会产生引力,这个引力是由物体的质量和距离决定的。这个理论揭示了质量和距离之间的关系,并证明了万有引力的本质。
[E]: 万有引力是由牛顿在1687年提出的。它是宇宙中一个基本的物理现象,它描述了物体之间的相互作用。这个力被称为万有引力,它在所有物体之间都存在,这种力与物体的质量和距离有关。万有引力是自然界中最基本的力量之一,它不仅影响着物体的运动和变化,还对宇宙的结构和演化产生着重要的影响。
[F]: 万有引力是英国物理学家艾萨克·牛顿在17世纪提出的。
[G]: 引力是宇宙中一种力量,它是由质量或能量所组成的。根据牛顿的万有引力定律,一个物体之间的引力大小与它的质量成正比,与它的质量成反比。因此,牛顿的万有引力定律表明,一个物体之间的引力大小与它的质量成正比,与它的质量成反比。这个定律在物理学、天文学和物理学中都有广泛的应用。
[H]: 万有引力是由英国物理学家艾萨克·牛顿于1687年提出的。他在1687年发表了一篇著名的论文,题为《自然哲学的数学原理》,其中提出了万有引力定律,指出任何两个物体之间都存在着一种引力,这种引力的大小与它们的质量成正比,与它们之间的距离的平方成反比。这个定律被称为万有引力定律,它描述了质量和距离之间的关系。它表明,两个物体之间的引力与它们的质量成正比,与它们之间的距离的平方成反比。这个定律对于解释宇宙中的许多现象和现象之间的相互作用非常重要,例如行星绕太阳运动、恒星之间的引力透镜效应等等。它也被广泛应用于天文学、物理学和工程学等领域,为我们理解宇宙的本质和演化提供了重要的理论基础。
[Q]: 你知道大熊猫的主要食物是什么吗?
[A]: 大熊猫主要以竹子为食,但也会吃其他植物、昆虫、小型哺乳动物和鱼类。它们的消化系统与其他熊科动物类似,能够消化竹子。大熊猫的牙齿锋利,适合咀嚼竹子,因此它们经常被称为“竹食动物”。
[B]: 大熊猫是以竹子为食的哺乳动物,它们是肉食动物。它们主要食物是竹子,特别是竹子中富含纤维素和矿物质,并且不会直接进入消化系统。大熊猫的肠道主要由竹子组成,它们需要消化竹子中的纤维素和其他纤维素来支持它们的正常进食。
[C]: 大熊猫主要以竹子为食,但它们也吃其他植物,包括树叶、嫩芽、昆虫、水果、蔬菜和小型哺乳动物等。大熊猫是中国特有的物种,属于熊科,主要分布在中国的四川、陕西和甘肃等地。
[D]: 大熊猫的主要食物是竹子。它们通常吃竹子,每天需要吃掉大约30公斤的竹子。
[E]: 是的,大熊猫主要以竹子为食。它们每天需要吃掉大量的竹子来维持生命,每天需要吃掉40至80公斤的竹子。此外,它们也会吃一些其他的植物,如草、果实、坚果和小型哺乳动物。
[F]: 大熊猫主要以竹子为食,也会偶尔食用其他的植物和动物。此外,大熊猫也会吃一些野果、小动物和植物,但这些都不属于它们的主要食物来源。
[G]: 大熊猫的主要食物是竹子。
[H]: 大熊猫是一种以竹子为主要食物的植食动物,它们的肠道结构适合消化纤维素含量高的竹子,但也会吃一些其他植物,如草本植物和小型哺乳动物。大熊猫的肠道结构也适应了高纤维的竹子,这使得它们能够消化大量的纤维素,从而帮助它们在野外生存。大熊猫的肠道中还含有一种叫做“纤维素酶”的酶类物质,这种酶能够帮助大熊猫分解纤维素,从而帮助它们消化大量的水分和营养物质。此外,大熊猫的肠道还含有一种叫做“纤维素蛋白”的蛋白质,这种蛋白质能够帮助大熊猫维持体内的能量水平,从而帮助它们在野外生存。总的来说,大熊猫的主要食物是竹子,但它们也会吃一些其他植物,如草本植物和小型哺乳动物。
[Q]: 海水为什么是咸的?
[A]: 海水之所以是咸的,主要是由于其中的化学成分。海水中含有大量的矿物质和盐分,其中主要成分是硫酸盐、碳酸盐、镁离子和钾离子。这些物质在高浓度下,会沉淀在海水中,形成海水。具体来说,海水中的矿物质和盐分主要来自于岩石、土壤、生物和人类排泄物等。岩石是海水的组成部分,主要由矿物质组成,如石英、长石、云母等。土壤中含有大量的化学物质,如氮、磷、钾等,这些物质会渗透到土壤中,形成土壤。生物体内的盐分主要来源于植物的叶绿素等色素,这些色素会吸收太阳光中的能量,转化为生物体可以利用的能量。而人类排泄物中含有的盐分主要来源于矿物质和碳酸盐,这些矿物质在水中的浓度较高,会导致水中溶解的矿物质浓度增加,形成水体中的盐分。因此,海水之所以是咸的,是由于其中的化学成分和结构特征。海水中的矿物质和盐分主要来自于岩石、土壤、生物和人类排泄物等,而这些物质的积累和排泄会导致海水呈现出咸味。
[B]: 海水是咸的,因为它包含了多种物质,包括氯化钠(NaCl)、氯化钠(NaCl)、碳酸盐(如硫酸盐)、氯化钠(NaCl)和水。咸物质的来源:盐是咸的常见来源,含有钾、钙、镁、钾等矿物质。这些矿物质通常来自于地下水、地下水或土壤中,这些矿物质的含量远低于纯净水。咸味的原因:盐的咸味主要源于其溶解于水中的微小颗粒物。这些小颗粒物主要通过植物根部和动物体内的微生物传播,如昆虫、小型哺乳动物、鸟类和某些昆虫。综上所述,海水是咸的,因为它含有大量的盐类,包括矿物质、岩石、盐、矿物质等,这些物质在海水中的浓度高,对咸味的产生较大影响。
[C]: 海水是咸的,主要是因为其中含有各种盐分,其中以海水为主要成分之一。当海水中含有大量的盐分时,盐分会被溶解在海水中,形成咸味。海水中的盐分主要来源于溶解盐类、蒸发析出的盐和海水冻结成冰等因素。
[D]: 海水中含有许多盐分,这些盐分可以导致海水咸度的增加。海水中的盐分主要来自于海水中的盐分,这些盐分可以从海水中的盐分中获得。海水中的氧化物和盐分可以与海水中的盐分形成氢键,从而使海水咸度增加。
[E]: 海水是咸的,这是因为它含有大量的盐分,其中包括氯离子和氯离子。这些离子在海洋中形成了大量的盐,这些盐分被称为盐分。海水中含有许多溶解在水中的盐,这些盐分会随着海水的升高而逐渐蒸发,最终导致海水中盐分的蒸发。这些盐分在海水中被完全消耗,然后被排除在海水中。此外,海水中还含有许多其他的盐分,但这些盐分是不同的,因为它们在水中的浮起和下降过程是不同的。海水中的盐分是从海底的岩石中提取出来的,这些盐分在海水中可以被利用。总的来说,海水是咸的,这也是为什么海水中含有许多盐分和盐分的原因。
[F]: 海水含有许多溶解的盐类,其中最主要的是氯化钠。当水中加入盐时,盐的分子会与水分子发生反应,形成氢氧化钠,这是一种盐。
[G]: 因为水是许多生物的生存所必需的,它能够吸收和释放二氧化碳,释放氧气,维持大气中的水蒸气,以及维持水循环。水中的盐类和化学物质会释放出氧气,使地球上的生物呼吸。此外,水还具有许多其他功能,如保持水循环、维持水循环、调节气候和环境等。水是许多生物生存所必需的,它也是许多自然现象的重要基础。
[H]: 海水是咸的主要原因是因为其中含有大量的溶解盐类,其中最主要的是氯化钠(NaCl)。氯化钠是一种非常常见的盐类,它存在于地球上的大部分地区,包括海洋、湖泊、河流和地下水。当水从海洋中蒸发时,它会带走一部分盐分,使海水变得咸。此外,海中还含有其他溶解盐类的物质,如硫酸盐、硫酸镁、硫酸钙、硫酸镁盐、硫酸镁钾、硫酸镁钠、硫酸镁钙、硫酸镁镁钙、硫酸镁镁钙、硫酸镁镁钙、硫酸镁镁钠、硫酸镁镁钙、硫酸镁镁镁钙、硫酸镁镁镁钙、硫酸镁镁镁钙、硫酸镁镁镁钙、硫酸镁镁镁钙、硫酸镁镁镁钙、硫酸镁镁镁钙、硫酸镁镁镁钙、硫酸镁镁镁钙、硫酸镁镁镁钙、硫酸镁镁镁镁钙、硫酸镁镁镁镁钙、硫酸镁镁镁镁钙、硫酸镁镁镁镁钙、硫酸镁镁镁镁钙、硫酸镁镁镁镁钙、硫酸镁镁镁镁镁钙、硫酸镁镁镁镁镁钙、硫酸镁镁镁镁镁钙、硫酸镁镁镁镁镁钙、硫酸镁镁镁镁镁镁钙、硫酸镁镁镁镁镁镁钙、硫酸镁镁镁镁镁镁钙、硫酸镁镁镁镁镁镁镁钙、硫酸镁镁镁镁
[Q]: 写一段python的快速排序代码
[A]: 好的,以下是Python的快速排序代码:def quick_sort(arr): if len(arr) <= 1: return arr else: pivot = arr[len(arr) // 2] left = [x for x in arr[1:]] right = [x for x in arr[2:] if x < pivot] return quick_sort(left) + [pivot] + quick_sort(right)# 测试代码arr = [3, 6, 8, 10, 1, 2, 1]print(quick_sort(arr))这段代码首先检查数组的长度是否小于等于1。如果是,那么它就按照从小到大的顺序排列。接着,对于数组的每个元素,它想要进行递归地排序,直到达到排序的结束位置。如果数组的长度是奇数,那么它就按照从大到小的顺序排列。最后,平均的时间复杂度是O(n log n),其中n是数组的长度。
[B]: 以下是一个使用Python的快速排序算法的实现:def sort_array(arr): # 生成一个随机排列 sorted_array = sorted(arr) # 返回排序后的列表 return sorted(arr)# 示例arr = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]# 调用函数sorted_array = sort_array(arr)print("原始数组是:", sorted_array)这段代码首先生成一个随机排列,然后通过一个列表的长度进行排序。最后,返回排序后的列表。注意,这个代码使用了Python的内置sorted函数和内置的sorted()函数,它们在时间复杂度为O(n log n)的情况下,比直接使用sorted()函数更快。
[C]: 以下是一个简单的排序算法,适用于小型数组,包括小于等于5的数组。def quick_sort(arr): if len(arr) <= 1: return arr else for _ in range(2, 51): if arr[len(arr) - 1:] is not None: arr[len(arr) - 1] = arr[len(arr) - 1] arr[len(arr)] = arr[len(arr)] return quick_sort(arr)# 示例arr = [3, 5, 1, 4, 1, 2, 1]sorted_arr = quick_sort(arr)print("Sorted array is:", sorted_arr)### 代码解释:1. **函数定义**:quick_sort 函数接受一个数组 arr 作为参数。2. **循环**:对于 arr 的元素,使用递归来实现,即从右到左遍历数组,并将当前元素与数组的当前元素交换位置,直到数组达到空或数组长度小于等于 arr。3. **递归调用**:遍历数组,将当前元素与数组的当前元素进行交换,直到数组达到空或数组长度小于等于 arr。4. **检查数组长度**:使用 if __name__ == "__main__" 检查数组的长度是否小于 arr,如果是,就将数组从较大的数组移动到数组的末尾。5. **输出排序后的结果**
[D]: 作为一个AI语言模型,我无法编写Python程序。但是,Python是一个非常流行的编程语言,它具有许多优秀的库和框架,包括NumPy、Pandas、Matplotlib、Pandas、Bandke、Norm和NKlearn等。Python有很多优秀的自然语言处理库和框架,可以用于处理和分析各种类型的数据,如文本、图像、音频、视频等。Python还可以用于创建各种应用程序,如网页搜索、游戏开发、科学计算等。Python非常灵活,也支持多种编程风格,因此可以轻松地找到适合自己的解决方案。
[E]: 作为AI语言模型,我可以回答您的问题。但是我无法进行编程操作,因为我无法确定您想问的是哪种编程语言。请您提供更具体的问题,以便我能够更好地回答您的问题。
[F]: 快速排序(Quick Sort)是一种简单的排序算法,它通过将数据划分为小组,每组数据按照一定规则排序,最终得到一个有序数组。
[G]: 好的,以下是一段快速排序的Python代码```python sorted_print(sorted(sorted(sorted(sorted_print))```输出结果为:```[1, 2, 3, 4, 5, 6, 7, 8, 9]```
[H]: 快速排序是一种基于分治思想的高效排序算法,它的时间复杂度为 O(n log n)。它的基本思想是通过一趟排序将待排序序列划分为两个子序列,其中一个子序列的所有元素都比另一个子序列的所有元素小,然后对这两个子序列分别递归地进行排序。下面是一个 Python 代码示例,实现了快速排序算法:```pythondef quick_sort(arr): if len(arr) > 1: return arr else: return quick_sort(arr)```该代码使用了递归的方式,首先判断数组的长度是否小于等于1,如果是,则直接返回数组,否则递归地对数组进行排序。在递归过程中,我们首先判断数组的第一个元素是否大于等于1,如果是,则将数组中的元素与第一个元素进行比较,如果不等于1,则将数组中的元素与第二个元素进行比较,以此类推,直到所有元素都被排序。该代码的时间复杂度为 O(n log n),是一个非常实用的排序算法。
[Q]: 截至2025年2月9日14时31分,电影《哪吒之魔童闹海》累计票房(含预售)突破78.09亿元,超过《星球大战:天行者崛起》的票房成绩,成为首部进入全球票房榜前40的亚洲电影。这一成就不仅标志着中国电影在国际市场的突破,也展示了中国动画电影的巨大潜力和市场吸引力。从这段文字总结20字的摘要给我。
[A]: 电影《哪吒之魔童闹海》票房突破78.09亿元,成为首部进入全球票房榜前40的亚洲电影。这标志着中国电影的国际突破,也展现了中国动画电影的巨大潜力。
[B]: 2025年2月9日,中国电影在世界舞台上突破78.09亿元,超过《星球大战:天行者崛起》的票房成绩。
[C]: 电影《哪吒之魔童闹海》票房突破78.09亿元,成为全球票房榜前40的亚洲电影,展现中国动画电影的巨大潜力。
[D]: 《哪吒之魔童闹海》累计票房突破78.09亿,成为首部进入全球票房榜前40的亚洲电影,展示了中国电影在国际市场的突破,同时展示了中国动画电影的巨大潜力和市场吸引力。
[E]: 《猛龙强:打开世界奇幻冒险》成功地,上海成功吸引了众多观众,同时也引起了广泛的关注。
[F]: 《哪吒之魔童闹海》是一部让中国动画电影成为国际市场的巨大潜力。该电影以其华丽的画面和惊险的战斗场面而闻名于世。
[G]: 《哪行是值得的》是2021年1月9日14时31分,电影《哪成大作为》在电影市场突破7800亿美元,成为中国动画电影的突破,并展示中国动画电影的巨大潜力和市场吸引力。
[H]: 我是一个参数很少的AI模型🥺,知识库较少,无法直接回答您的问题,换个问题试试吧👋
🙋直接把以上所有问题和模型的回答丢给DeepSeek-R1,让它帮忙点评和排名打分:
具体点评
评分标准:
- 准确性:回答是否正确且无明显错误。
- 完整性:回答是否涵盖了问题的核心要点。
- 逻辑性:回答是否有条理,是否符合逻辑。
- 代码质量:代码是否能正常运行,逻辑是否清晰。
点评:
-
A模型:
- 优点:回答非常全面,信息量大,逻辑清晰,尤其在长江、大熊猫、海水咸味等问题上表现优异。代码虽然有小瑕疵,但整体思路正确。
- 缺点:部分回答稍显冗长,但不影响整体质量。
- 总评:综合表现最佳,得分最高。
-
H模型:
- 优点:回答较为准确,尤其在珠穆朗玛峰、万有引力等问题上表现出色。代码虽未完全展示,但解释较为详细。
- 缺点:部分回答略显啰嗦,但逻辑性较强。
- 总评:仅次于A模型,表现稳定。
-
C模型:
- 优点:回答简洁明了,尤其在大熊猫、快速排序代码等问题上表现较好。
- 缺点:部分回答略显简短,缺乏深入解释。
- 总评:整体表现不错,但细节上略逊于A和H。
-
F模型:
- 优点:回答较为准确,尤其在长江、万有引力等问题上表现尚可。代码部分有一定逻辑性。
- 缺点:部分回答不够深入,代码存在一些小问题。
- 总评:表现中规中矩,仍有提升空间。
-
D模型:
- 优点:回答基本准确,尤其在万有引力、长江等问题上表现尚可。
- 缺点:部分回答过于简略,代码存在明显错误。
- 总评:表现一般,需改进代码部分。
-
B模型:
- 优点:回答较为准确,尤其在长江、海水咸味等问题上表现尚可。
- 缺点:部分回答逻辑性较差,代码存在较大问题。
- 总评:表现平平,需进一步优化。
-
E模型:
- 优点:部分回答较为准确,尤其在海水咸味、大熊猫等问题上表现尚可。
- 缺点:回答过于简略,代码部分几乎无法运行。
- 总评:表现不佳,需大幅提升。
-
G模型:
- 优点:几乎没有明显的优点。
- 缺点:回答严重偏离主题,代码部分完全无法运行。
- 总评:表现最差,需大幅改进。
总结:
- A模型在各方面表现最为出色,尤其在复杂问题的回答上展现了极高的准确性与逻辑性。
- H模型紧随其后,表现稳定,但在某些细节上略显不足。
- G模型表现最差,回答偏离主题且代码无法运行,需大幅改进。
打分排序
排名 | 模型 | 准确性 (30分) | 完整性 (30分) | 逻辑性 (20分) | 代码质量 (20分) | 总分 (100分) |
---|---|---|---|---|---|---|
1 | A | 28 | 29 | 19 | 20 | 96 |
2 | H | 27 | 28 | 18 | 20 | 93 |
3 | C | 26 | 27 | 18 | 18 | 89 |
4 | F | 25 | 26 | 17 | 18 | 86 |
5 | D | 24 | 25 | 17 | 16 | 82 |
6 | B | 23 | 24 | 16 | 15 | 78 |
7 | E | 22 | 23 | 15 | 14 | 74 |
8 | G | 10 | 12 | 10 | 10 | 42 |
👉主观效果总结
个人主观评价与DeepSeek-R1基本相符,其中:
-
MiniMind系列的排序非常符合直觉,参数越大+训练数据越充分评分越高,幻觉和错误都会比小模型肉眼可见的好。
-
H模型的回答肉眼看起来是不错的,尽管存在些许幻觉瞎编的情况。
-
G模型可能训练数据不够完备,给出的权重经过测试效果不佳。
-
再复诵一遍经久不衰的Scaling Law: 参数越大,训练数据越多模型的性能越强。
Ⅲ Objective Benchmark
下面就到喜闻乐见的benchmark刷榜测试环节,就不找乐子和qwen、glm级别的中文模型做对比了。 这里选取了一些<1B的微型模型进行横评比较, 测试集选择C-Eval、CMMLU、A-CLUE、TMMLU+这几个纯中文语言榜单。
测评框架
测评框架选择lm-evaluation, 安装后启动测试非常方便:
lm_eval --model hf --model_args pretrained=<填写模型路径>,device=cuda,dtype=auto --tasks ceval* --batch_size 8 --trust_remote_code
PS: 在这种全是选择题的测评集中,为了避免回复格式难以固定的问题,常用做法是直接把 `A`、`B`、`C`、`D` 四个字母对应token的预测概率取出来,将其中概率最大的字母与标准答案计算正确率。选择题1/4乱选的正确率是25%,然而这个量级的所有模型都集中在25附近,甚至很多时候不如瞎选,是不是像极了高中完形填空的滑铁卢正确率... MiniMind模型本身预训练数据集小的可怜,也没有针对性的对测试集做刷榜微调,因此结果图一乐即可:
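“取 A/B/C/D 四个字母 token 概率最大者”的做法,可用如下最小示意代码实现(模型路径与提示词格式均为演示假设):

```python
# 演示:用下一 token 的 logits 在 A/B/C/D 中选概率最大的字母
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("./MiniMind2", trust_remote_code=True)       # 路径为假设
model = AutoModelForCausalLM.from_pretrained("./MiniMind2", trust_remote_code=True).eval()

def pick_choice(prompt: str) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]            # 最后位置的下一 token 分布
    letters = ["A", "B", "C", "D"]
    scores = [logits[tok.convert_tokens_to_ids(l)].item() for l in letters]
    return letters[scores.index(max(scores))]
```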
models | from | params↓ | ceval↑ | cmmlu↑ | aclue↑ | tmmlu+↑ |
---|---|---|---|---|---|---|
MiniMind2 | JingyaoGong | 104M | 26.52 | 24.42 | 24.97 | 25.27 |
MiniMind2-Small | JingyaoGong | 26M | 26.37 | 24.97 | 25.39 | 24.63 |
MiniMind2-MoE | JingyaoGong | 145M | 26.6 | 25.01 | 24.83 | 25.01 |
Steel-LLM | ZhanShiJin | 1121M | 24.81 | 25.32 | 26 | 24.39 |
GPT2-medium | OpenAI | 360M | 23.18 | 25 | 18.6 | 25.19 |
TinyLlama-1.1B-Chat-V1.0 | TinyLlama | 1100M | 25.48 | 25 | 25.4 | 25.13 |
SmolLM2 | HuggingFaceTB | 135M | 24.37 | 25.02 | 25.37 | 25.06 |
Aquila-Instruct | BAAI | 135M | 25.11 | 25.1 | 24.43 | 25.05 |
📌 其它 (Others)
推理与导出
-
./scripts/convert_model.py可以将torch/transformers模型互相转换。
-
MiniMind的HuggingFace集合地址: MiniMind
基于MiniMind-API服务接口
-
./scripts/serve_openai_api.py完成了兼容openai-api的最简聊天接口,方便将自己的模型接入第三方UI 例如FastGPT、OpenWebUI、Dify等等。
-
从Huggingface下载模型权重文件,文件树:
<MiniMind-Model-Name> (root dir)
├─<MiniMind-Model-Name>
|  ├── config.json
|  ├── generation_config.json
|  ├── LMConfig.py
|  ├── model.py
|  ├── pytorch_model.bin
|  ├── special_tokens_map.json
|  ├── tokenizer_config.json
|  └── tokenizer.json
-
启动聊天服务端
python serve_openai_api.py
-
测试服务接口
python chat_openai_api.py
-
API接口示例,兼容openai api格式
curl http://ip:port/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "model-identifier",
    "messages": [
      { "role": "user", "content": "世界上最高的山是什么?" }
    ],
    "temperature": 0.7,
    "max_tokens": 512,
    "stream": true
  }'
📌 Acknowledge
[!NOTE] 如果觉得 `MiniMind系列` 对您有所帮助,可以在 GitHub 上加一个⭐
篇幅超长水平有限难免纰漏,欢迎在Issues交流指正或提交PR改进项目
您的小小支持就是持续改进此项目的动力!
🤝贡献者
😊鸣谢
参考链接 & 感谢以下优秀的论文或项目
- 排名不分任何先后顺序
- https://github.com/meta-llama/llama3
- https://github.com/karpathy/llama2.c
- https://github.com/DLLXW/baby-llama2-chinese
- (DeepSeek-V2)https://arxiv.org/abs/2405.04434
- https://github.com/charent/ChatLM-mini-Chinese
- https://github.com/wdndev/tiny-llm-zh
- (Mistral-MoE)https://arxiv.org/pdf/2401.04088
- https://github.com/Tongjilibo/build_MiniLLM_from_scratch
- https://github.com/jzhang38/TinyLlama
- https://github.com/AI-Study-Han/Zero-Chatgpt
- https://github.com/xusenlinzy/api-for-open-llm
- https://github.com/HqWu-HITCS/Awesome-Chinese-LLM
🫶支持者
License
This repository is licensed under the Apache-2.0 License.
FastGPT is a knowledge-based platform built on the LLMs, offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you easily develop and deploy complex question-answering systems without the need for extensive setup or configuration.
FastGPT
FastGPT 是一个基于 LLM 大语言模型的知识库问答系统,提供开箱即用的数据处理、模型调用等能力。同时可以通过 Flow 可视化进行工作流编排,从而实现复杂的问答场景!
https://github.com/labring/FastGPT/assets/15308462/7d3a38df-eb0e-4388-9250-2409bd33f6d4
🛸 在线使用
- 🌍 国际版:tryfastgpt.ai
💡 RoadMap
1
应用编排能力
- 对话工作流、插件工作流
- 工具调用
- Code sandbox
- 循环调用
- 用户选择
- 表单输入
2
知识库能力
- 多库复用,混用
- chunk 记录修改和删除
- 支持手动输入,直接分段,QA 拆分导入
- 支持 txt,md,html,pdf,docx,pptx,csv,xlsx (有需要更多可 PR file loader),支持 url 读取、CSV 批量导入
- 混合检索 & 重排
- API 知识库
- 自定义文件读取服务
- 自定义分块服务
3
应用调试能力
- 知识库单点搜索测试
- 对话时反馈引用并可修改与删除
- 完整上下文呈现
- 完整模块中间值呈现
- 高级编排 DeBug 模式
4
OpenAPI 接口
- completions 接口 (chat 模式对齐 GPT 接口)
- 知识库 CRUD
- 对话 CRUD
5
运营能力
- 免登录分享窗口
- Iframe 一键嵌入
- 聊天窗口嵌入支持自定义 Icon,默认打开,拖拽等功能
- 统一查阅对话记录,并对数据进行标注
6
其他
- 可视化模型配置。
- 支持语音输入和输出 (可配置语音输入语音回答)
- 模糊输入提示
- 模板市场
👨💻 开发
项目技术栈:NextJs + TS + ChakraUI + MongoDB + PostgreSQL (PG Vector 插件)/Milvus
-
⚡ 快速部署
使用 Sealos 服务,无需采购服务器、无需域名,支持高并发 & 动态伸缩,并且数据库应用采用 kubeblocks 的数据库,在 IO 性能方面,远超于简单的 Docker 容器部署。
🏘️ 社区交流群
扫码加入飞书话题群:
🏘️ 加入我们
我们正在寻找志同道合的小伙伴,加速 FastGPT 的发展。你可以通过 FastGPT 2025 招聘了解 FastGPT 的招聘信息。
💪 相关项目
🌿 第三方生态
👀 其他
🤝 参与贡献
我们非常欢迎各种形式的贡献。如果你对贡献代码感兴趣,可以查看我们的 GitHub Issues,大展身手,向我们展示你的奇思妙想。
🌟 Star History
使用协议
本仓库遵循 FastGPT Open Source License 开源协议。
- 允许作为后台服务直接商用,但不允许提供 SaaS 服务。
- 未经商业授权,任何形式的商用服务均需保留相关版权信息。
- 完整请查看 FastGPT Open Source License
- 联系方式:[email protected],点击查看商业版定价策略