omniparser-autogui-mcp

by NON906

Python-based MCP server that uses Microsoft OmniParser to visually parse the active screen and then drives the GUI automatically (mouse / keyboard). Primarily targeted at Windows but can run cross-platform.

Open Source

Installation

1. Prerequisites
• Python ≥3.9 installed and added to PATH
• Windows/macOS/Linux desktop session (the server controls the local GUI)
2. Clone the repository
git clone https://github.com/NON906/omniparser-autogui-mcp.git
cd omniparser-autogui-mcp
3. Install Python dependencies
# If a requirements.txt file exists
pip install -r requirements.txt
# or build an editable install
pip install -e .
4. Configuration (optional)
• Create .env in the project root to override defaults such as the server port and screen-capture interval.
Example:
MCP_PORT=8765
MCP_AUTH_TOKEN=your-long-token
• If headless execution is required on Linux, start a virtual display (e.g., xvfb-run python server.py)
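To make the .env format concrete, here is a minimal sketch of how KEY=VALUE settings like MCP_PORT and MCP_AUTH_TOKEN could be read using only the Python standard library. The helper name parse_env is hypothetical, and whether the server itself loads .env this way is an assumption; check the project README for the actual mechanism.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments.

    Hypothetical helper for illustration; not part of the project.
    """
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


# The example values from the listing above.
example = "MCP_PORT=8765\nMCP_AUTH_TOKEN=your-long-token\n"
settings = parse_env(example)

# Fall back to 8765 (the example value, an assumed default) if unset.
port = int(settings.get("MCP_PORT", "8765"))
```

In practice the text would come from reading the .env file in the project root rather than from an inline string.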
5. Launch the MCP server
python server.py  # or the project's main entry point (check the README or cli.py)
6. Connect a client
Point any MCP-compatible client at ws://<host>:<MCP_PORT> using the auth token you configured.
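Before pointing a client at the server, it can help to confirm that something is listening on the configured port. The sketch below builds the ws://<host>:<MCP_PORT> address and does a plain TCP reachability check with the standard library. The function names are hypothetical, the check does not speak WebSocket or MCP, and it does not send the auth token, since how the token is transmitted is not specified here.

```python
import socket


def build_ws_url(host: str, port: int) -> str:
    """Format the WebSocket address described in step 6."""
    return f"ws://{host}:{port}"


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    This only shows that a process is listening; it does not verify
    that the listener is the MCP server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed values: localhost and the example port 8765.
url = build_ws_url("localhost", 8765)
```

A call such as is_port_open("localhost", 8765) returning False usually means the server has not started yet or is bound to a different port.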

Documentation

License: MIT License
Updated 7/30/2025