From fa6007c1f3119958c8c6fa4482e8cd691db6b9b1 Mon Sep 17 00:00:00 2001
From: Francisco Pessano
Date: Sat, 12 Jul 2025 03:05:55 +0000
Subject: [PATCH] Revise README to reflect YouTube Video Classifier features and setup instructions

---
 README.md | 391 +++++++++++++++++++++++++++++-------------------------
 1 file changed, 210 insertions(+), 181 deletions(-)

diff --git a/README.md b/README.md
index 091bd00..70d86b5 100644
--- a/README.md
+++ b/README.md
@@ -1,185 +1,214 @@
-# AutoDelete YouTube Videos 🗑️🎬
+# YouTube Video Classifier
 
-This script automates the process of deleting videos from your YouTube "Watch Later" playlist using PyAutoGUI.
+An AI-powered tool that automatically classifies YouTube videos in your "Watch Later" playlist based on their titles and thumbnails using vision-language models through Ollama.
+
+## Features ✨
+
+- 🤖 **AI-Powered Classification**: Uses Ollama with Qwen2.5-VL and fallback models to analyze video titles and thumbnails
+- 🔄 **Robust LLM Integration**: Automatic fallback between models with increasing timeouts for reliability
+- 📊 **Comprehensive CSV Storage**: Saves detailed video information including classifications, metadata, and thumbnails
+- 🌐 **Multi-language Detection**: Automatically detects video language using AI
+- 🏷️ **Smart Tagging**: Generates detailed sub-tags for better content organization
+- 🎯 **Smart Categories**: Uses existing classifications or creates new ones automatically
+- 🖥️ **Browser Automation**: Selenium-based interaction with YouTube for reliable data extraction
+- 🎨 **Beautiful Logging**: Rich console output with colors and emojis for better UX
+- ⌨️ **Easy Control**: Press 'q' at any time to safely quit the process
+
+## Quick Start
+
+### Prerequisites
+- Python 3.11.10+
+- Ollama installed locally
+- Chrome or Chromium browser
+
+### Setup
+
+1. **Install Ollama**: Download from [https://ollama.ai](https://ollama.ai)
+
+2. **Pull Required Models**:
+   ```bash
+   ollama pull qwen2.5vl:7b
+   ollama pull gemma2:2b
+   ```
+
+3. **Start Ollama Service**:
+   ```bash
+   ollama serve
+   ```
+
+4. **Clone and Setup Project**:
+   ```bash
+   git clone
+   cd youtube-video-classifier
+
+   # Create virtual environment
+   python -m venv venv
+   source venv/bin/activate  # On Windows: venv\Scripts\activate
+
+   # Install dependencies
+   pip install -r requirements.txt
+   ```
+
+5. **Configure Settings** (optional):
+   Edit `config.ini` to customize your setup
+
+6. **Run the Classifier**:
+   ```bash
+   python script.py
+   ```
+
+## How It Works 🔄
+
+1. **Browser Initialization**: Opens Chrome/Chromium and navigates to your YouTube "Watch Later" playlist
+2. **Video Detection**: Finds and extracts information from playlist videos using Selenium
+3. **Data Extraction**: Captures video title, thumbnail, channel info, duration, and upload date
+4. **AI Analysis**: Uses Ollama models to:
+   - Classify the video into categories
+   - Detect the primary language
+   - Generate detailed sub-tags
+5. **Smart Fallback**: If primary model fails/times out, automatically switches to fallback model
+6. **Data Storage**: Saves all information to CSV with base64-encoded thumbnails
+7. **Playlist Management**: Removes processed videos from "Watch Later" playlist
+8. **Continuous Processing**: Continues until all videos are processed or user quits
+
+## Configuration
+
+The `config.ini` file allows you to customize various settings:
+
+```ini
+[DEFAULT]
+# Ollama settings
+ollama_host = http://localhost:11434
+ollama_model = qwen2.5vl:7b
+ollama_fallback_model = gemma2:2b
+
+# File paths
+classifications_csv = video_classifications.csv
+playlist_url = https://www.youtube.com/playlist?list=WL
+
+# LLM timeout settings (in seconds)
+llm_primary_timeout = 60
+llm_fallback_timeout = 60
+
+# Processing settings
+enable_delete = false
+enable_playlist_creation = false
+```
+
+## CSV Output Format 📋
+
+The script creates a comprehensive CSV file with the following columns:
+
+- `video_title`: Title of the video
+- `video_url`: YouTube URL of the video
+- `thumbnail_url`: Path to the saved thumbnail
+- `classification`: AI-generated category
+- `language`: Detected language of the video
+- `channel_name`: Name of the YouTube channel
+- `channel_link`: URL to the channel
+- `video_length_seconds`: Duration in seconds
+- `video_date`: Upload date
+- `detailed_subtags`: AI-generated specific tags
+- `playlist_name`: Source playlist name
+- `playlist_link`: Source playlist URL
+- `image_data`: Base64-encoded thumbnail data
+- `timestamp`: When the classification was made
+
+## File Structure 📁
+
+```
+├── script.py                    # Main classification script
+├── config.ini                   # Configuration settings
+├── requirements.txt             # Python dependencies
+├── video_classifications.csv    # Generated results (created when first run)
+└── README.md                    # This file
+```
+
+## Features in Detail
+
+### AI Classification System
+- **Primary Model**: Qwen2.5-VL 7B for high-quality vision-language analysis
+- **Fallback Model**: Gemma2 2B for faster processing when primary model is slow
+- **Timeout Management**: Automatically increases timeout periods if models are struggling
+- **Continuous Retry**: Keeps trying until successful or user cancels
+
+### Data Extraction
+- **Video Metadata**: Title, URL, duration, upload date
+- **Channel Information**: Name and link to channel
+- **Thumbnail Capture**: Screenshots saved as base64 in CSV
+- **Playlist Context**: Source playlist name and URL
+
+### Browser Automation
+- **Multiple Chrome Paths**: Automatically finds Chrome/Chromium installation
+- **WebDriver Management**: Handles chromedriver setup and fallbacks
+- **Robust Selectors**: Multiple CSS selectors for reliable element finding
+- **Error Recovery**: Graceful handling of UI changes and loading delays
+
+### User Experience
+- **Rich Console Output**: Colored logging with emojis and status indicators
+- **Progress Tracking**: Clear indication of current processing status
+- **Safe Exit**: Press 'q' at any time to cleanly stop processing
+- **Error Reporting**: Detailed error messages for troubleshooting
+
+## Testing Your Setup
+
+Before running the main script, you can test individual components:
+
+1. **Test Ollama Connection**:
+   ```python
+   import requests
+   response = requests.get('http://localhost:11434/api/tags')
+   print(response.json())
+   ```
+
+2. **Test Browser Automation**:
+   Run the script and check if Chrome opens correctly
+
+3. **Test Model Response**:
+   The script will verify model availability on startup
+
+## Troubleshooting 🔧
+
+### Common Issues
+
+**Ollama Connection Error**:
+- Ensure Ollama is running: `ollama serve`
+- Check the host URL in config.ini
+- Verify models are installed: `ollama list`
+
+**Browser Issues**:
+- Install Chrome or Chromium
+- Update chromedriver if needed
+- Check if browser is in PATH
+
+**Model Timeout**:
+- The script automatically handles timeouts with fallback
+- Consider increasing timeout values in config.ini
+- Ensure sufficient system resources
+
+**Selenium Errors**:
+- YouTube may have changed their HTML structure
+- Check for browser updates
+- Verify you're logged into YouTube
+
+### Performance Tips
+
+- **For faster processing**: Use smaller models like `gemma2:2b` as primary
+- **For better accuracy**: Use larger models like `qwen2.5vl:7b` as primary
+- **For stability**: Keep both models installed for automatic fallback
+- **For large playlists**: Consider running in smaller batches
+
+## Contributing
+
+1. Fork the repository
+2. Create a feature branch
+3. Test your changes thoroughly
+4. Submit a pull request
+
+## License
+
+MIT License - see LICENSE file for details
 
 ---
 
-## Requirements 📦
-
-- Python **3.11.10** 🐍
-
----
-
-## Setup & Usage (English) 🇬🇧
-
-### ⚙️ Requirements
-
-- Python **3.11.10** 🐍
-
-### 1️⃣ Create a Virtual Environment
-
-**With `venv`:**
-```bash
-python3.11 -m venv venv
-```
-
-**With `virtualenv`:**
-```bash
-python3.11 -m pip install virtualenv
-python3.11 -m virtualenv venv
-```
-
-### 2️⃣ Activate the Virtual Environment
-
-**On Linux/macOS:**
-```bash
-source venv/bin/activate
-```
-**On Windows:**
-```bash
-venv\Scripts\activate
-```
-
-### 3️⃣ Install Dependencies
-
-```bash
-pip install -r requirements.txt
-```
-
-### 4️⃣ Run the Script 🚀
-
-```bash
-python script.py
-```
-
----
-
-## Features & Customization 🛠️
-
-### Features
-
-- 🖼️ **Image Recognition:**
-  Uses screenshots in the `img` folder to locate and interact with UI elements (browser icon, YouTube buttons, etc).
-
-- 🔍 **Region-based Search:**
-  The script splits the screen into left and right halves to speed up image searches.
-
-- 🗂️ **Automatic Tab Handling:**
-  Automatically opens/closes tabs and navigates to your "Watch Later" playlist.
-
-- ⏱️ **Timing Controls:**
-  You can adjust `sleep` and `duration` values in the code to match your PC's speed.
-
-- 🛑 **Hotkey Listener:**
-  Press `q` at any time to safely stop the script.
-
-- 🖱️ **Easy Customization:**
-  Change the images in the `img` folder or tweak the logic in functions like `locate_img` and `change_to_not_available` to adapt to UI changes or other platforms.
-
-### Customization
-
-- 🖼️ **Browser Image:**
-  Replace the browser icon image in the `img` folder with a screenshot of your browser's icon. Make sure the filename matches the value of the `browser_img` variable in `script.py` (e.g., `brave.png` for Brave browser).
-
-- 🔗 **Playlist URL:**
-  You can change the playlist URL by editing the value of the `playlist_url` variable at the top of the `script.py` file.
-
-- ⏱️ **Adjust Timing:**
-  The script uses `sleep` and `duration` values to wait for your PC to respond. You may need to increase or decrease these values depending on your computer's speed and internet connection.
-
-- 📌 **Pin Your Browser:**
-  For the script to work, your browser must be pinned to your taskbar.
-
----
-
-# AutoDelete YouTube Videos 🗑️🎬
-
-Este script automatiza el proceso de eliminar videos de tu lista de "Ver más tarde" en YouTube usando PyAutoGUI.
-
----
-
-## Requisitos 📦
-
-- Python **3.11.10** 🐍
-
----
-
-## Configuración y Uso (Español) 🇪🇸
-
-### ⚙️ Requisitos
-
-- Python **3.11.10** 🐍
-
-### 1️⃣ Crear un Entorno Virtual
-
-**Con `venv`:**
-```bash
-python3.11 -m venv venv
-```
-
-**Con `virtualenv`:**
-```bash
-python3.11 -m pip install virtualenv
-python3.11 -m virtualenv venv
-```
-
-### 2️⃣ Activar el Entorno Virtual
-
-**En Linux/macOS:**
-```bash
-source venv/bin/activate
-```
-**En Windows:**
-```bash
-venv\Scripts\activate
-```
-
-### 3️⃣ Instalar las Dependencias
-
-```bash
-pip install -r requirements.txt
-```
-
-### 4️⃣ Ejecutar el Script 🚀
-
-```bash
-python script.py
-```
-
----
-
-## Funcionalidades y Personalización 🛠️
-
-### Funcionalidades
-
-- 🖼️ **Reconocimiento de Imágenes:**
-  Usa capturas en la carpeta `img` para localizar e interactuar con elementos de la interfaz (icono del navegador, botones de YouTube, etc).
-
-- 🔍 **Búsqueda por Regiones:**
-  El script divide la pantalla en mitades izquierda y derecha para acelerar la búsqueda de imágenes.
-
-- 🗂️ **Manejo Automático de Pestañas:**
-  Abre/cierra pestañas y navega automáticamente a tu lista de "Ver más tarde".
-
-- ⏱️ **Control de Tiempos:**
-  Puedes ajustar los valores de `sleep` y `duration` en el código según la velocidad de tu PC.
-
-- 🛑 **Escucha de Teclas:**
-  Presiona `q` en cualquier momento para detener el script de forma segura.
-
-- 🖱️ **Fácil Personalización:**
-  Cambia las imágenes en la carpeta `img` o ajusta la lógica en funciones como `locate_img` y `change_to_not_available` para adaptarlo a cambios en la interfaz o a otras plataformas.
-
-### Personalización
-
-- 🖼️ **Imagen del Navegador:**
-  Reemplaza la imagen del icono de tu navegador en la carpeta `img` con una captura de pantalla del icono de tu navegador. Asegúrate de que el nombre del archivo coincida con el valor de la variable `browser_img` en `script.py` (por ejemplo, `brave.png` para el navegador Brave).
-
-- 🔗 **URL de la Playlist:**
-  Puedes cambiar la URL de la playlist que se usa modificando el valor de la variable `playlist_url` al inicio del archivo `script.py`.
-
-- ⏱️ **Ajusta los Tiempos:**
-  El script utiliza valores de `sleep` y `duration` para esperar la respuesta de tu PC. Puede que necesites aumentar o disminuir estos valores dependiendo de la velocidad de tu computadora y conexión a internet.
-
-- 📌 **Ancla tu Navegador:**
-  Para que el script funcione, es indispensable que tengas tu navegador anclado a tu barra de tareas.
\ No newline at end of file
+**Note**: This tool is for personal use and educational purposes. Please respect YouTube's Terms of Service and rate limits.
\ No newline at end of file
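
Note on the fallback behavior: the README added by this patch describes "automatic fallback between models with increasing timeouts" and exposes `llm_primary_timeout` and `llm_fallback_timeout` in `config.ini`, but it does not show the retry logic itself. The following is a minimal sketch of one way that behavior could work, not the code in `script.py`. The names `load_settings`, `classify_with_fallback`, `call_model`, `max_rounds`, and `growth` are hypothetical, and the doubling schedule and the cap on rounds are assumptions (the README says the script keeps retrying until it succeeds or the user quits).

```python
import configparser
from typing import Callable

# Values copied from the README's config.ini example.
CONFIG_TEXT = """
[DEFAULT]
ollama_model = qwen2.5vl:7b
ollama_fallback_model = gemma2:2b
llm_primary_timeout = 60
llm_fallback_timeout = 60
"""


def load_settings(text: str) -> dict:
    """Parse the [DEFAULT] section into (model, timeout) pairs."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["DEFAULT"]
    return {
        "primary": (
            section["ollama_model"],
            section.getint("llm_primary_timeout"),
        ),
        "fallback": (
            section["ollama_fallback_model"],
            section.getint("llm_fallback_timeout"),
        ),
    }


def classify_with_fallback(
    prompt: str,
    settings: dict,
    call_model: Callable[[str, str, int], str],
    max_rounds: int = 3,   # assumption: cap retries instead of looping forever
    growth: float = 2.0,   # assumption: double the timeouts after each round
) -> str:
    """Try the primary model, then the fallback; on timeout, grow both
    timeouts and start another round."""
    for round_index in range(max_rounds):
        scale = growth ** round_index
        for model, base_timeout in (settings["primary"], settings["fallback"]):
            timeout = int(base_timeout * scale)
            try:
                # call_model is a hypothetical hook that sends the prompt to
                # Ollama and raises TimeoutError if no answer arrives in time.
                return call_model(model, prompt, timeout)
            except TimeoutError:
                continue  # move on to the next model or the next round
    raise TimeoutError(f"no model answered within {max_rounds} rounds")
```

Passing `call_model` in as a parameter keeps the sketch independent of any particular HTTP client, so the retry schedule can be exercised with a stub before pointing it at a running Ollama instance.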