Add: readme and changelog

Add: logging cleanup_media cmd
2023-09-16 15:28:43 +09:00 · 2023-09-16 15:23:51 +09:00
4 changed files with 702 additions and 2 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -0,0 +1,13 @@
+# Changelog
+
+## [0.1.0] - 2023-09-16
+### Added
+- Web interface;
+- Full-featured RESTful API v1;
+- Monitoring of free space in storage;
+- Deleting an archive or ticket also deletes the physical files;
+- Flexible deployment configuration using environment variables;
+- Dockerized app; the image size is less than 150 MB;
+- Support for sqlite3 and PostgreSQL^15;
+- WhiteNoise static file management;
+- Healthcheck for application availability;
--- a/README-ru.md
+++ b/README-ru.md
@@ -0,0 +1,324 @@
+# LOGS-COLLECTOR
+
+```sh
+█░░ █▀█ █▀▀ █▀ ▄▄ █▀▀ █▀█ █░░ █░░ █▀▀ █▀▀ ▀█▀ █▀█ █▀█
+█▄▄ █▄█ █▄█ ▄█ ░░ █▄▄ █▄█ █▄▄ █▄▄ ██▄ █▄▄ ░█░ █▄█ █▀▄
+```
+###  [English lang: README.md](README.md)
+
+###  [CHANGELOG.md](CHANGELOG.md)
+
+
+## Purpose
+
+If you develop software that clients later run in their own infrastructure, you know how hard it can be to investigate a problem without access to the server the software runs on.
+
+
+To address this, you can configure the software to send anonymized crash reports automatically, for example with Sentry. This is not always acceptable to the client; moreover, the information may be incomplete, or the client may require stricter confidentiality.
+
+
+In that case you can ask the client to send you the relevant log files and study them afterwards. This raises another problem: you need a way to transfer the files that is secure for both you and the client.
+It could be FTP, SFTP, cloud storage, etc. But what if you do not want to give the client authentication and authorization credentials?
+
+Perhaps you have access to the client's server and can read the log files in place, and the problem seems solved. But the client's server may lack tools for examining log files conveniently.
+Even if a support engineer can copy the files and study them locally, the problem of sharing those files with other employees remains.
+
+Logs-collector solves these problems.
+
+Logs-collector is a remote storage that can receive and serve files.
+
+
+## Terms
+- Platform: software developed by your company
+- Ticket: the number associated with a ticket in your help desk system
+- Archive: an uploaded log file (any format is supported)
+
+## How does it work?
+
+- Create platforms
+- Create a ticket associated with a platform and a number
+- Give the client the unique ticket token
+- The client uploads an archive of log files
+- Download the archive (and find a solution to the problem)
+- Delete the archive or the ticket, or mark the ticket as resolved
+
+## Features
+
+- Centralized storage
+- No auth credentials are needed to upload a file
+- Each upload token is unique and associated with only one ticket
+- The token has a limited number of attempts and a limited lifetime
+- Files can be uploaded from the console or via the web
+- Full-featured RESTful API v1
+- Monitoring of free space in storage
+- Deleting an archive or ticket also deletes the physical files
+- The application follows the twelve-factor app architecture
+- Flexible deployment configuration using environment variables
+- The application is dockerized; the image size is less than 150 MB
+- Works with both sqlite3 and PostgreSQL^15
+- Static file management without configuring a web server for it
+- Healthcheck for application availability
+
+## Security
+
+- The upload token is not associated with authorization
+- The upload token has high entropy
+- Two-factor authentication for users
+- 2FA must be enabled to download a file
+- The admin panel is patched to enforce 2FA
+- The user in the container is unprivileged
+- Standard Django and DRF protection methods
+
+## Installation
+
+### From the docker image:
+- Create a directory for the application wherever is convenient
+- Create a docker-compose.yml file in the application directory
+- Create a .env file in the application directory
+- Fill the .env file with the required environment variables, see below
+
+>Example file using a Docker volume for storage and sqlite as the default database:
+
+```yaml
+version: "3"
+
+# to set environment variables:
+# create a .env file in the same directory as docker-compose.yaml
+
+services:
+  server:
+    image: mois3y/logs_collector:0.1.0
+    container_name: logs-collector
+    restart: unless-stopped
+    env_file:
+      - ./.env
+    ports:
+      - "80:8000"
+    volumes:
+      - /etc/timezone:/etc/timezone:ro  # optional
+      - /etc/localtime:/etc/localtime:ro  # optional
+      - logs_collector_data:/data
+
+volumes:
+  logs_collector_data:
+```
+
+### From the source:
+- Clone the repository
+- docker-compose.yaml is already in the project directory
+- Create a .env file in the project root
+- Fill .env with the required environment variables, see below
+- Build the image and run the container in the background:
+  
+```sh
+docker-compose up -d --build
+```
+- You can create your own file and make the necessary edits:
+#### docker-compose-example-psql.yaml with PostgreSQL by default:
+
+```yaml
+services:
+  logs_collector:
+    container_name: logs-collector
+    build:
+      context: .
+      args:
+        - VERSION=${VERSION}
+        - SRC_DIR=${SRC_DIR}
+        - SCRIPTS_DIR=${SCRIPTS_DIR}
+        - APP_DIR=${APP_DIR}
+        - DATA_DIR=${DATA_DIR}
+        - WEB_PORT=${WEB_PORT}
+        - USER_NAME=${USER_NAME}
+        - USER_GROUP=${USER_GROUP}
+        - APP_UID=${APP_UID}
+        - APP_GID=${APP_GID}
+    ports:
+      - "${WEB_HOST}:${WEB_PORT}:${WEB_PORT}"
+    volumes:
+      - type: volume
+        source: logs_collector_data
+        target: ${APP_DIR}/data
+    env_file:
+      - ./.env
+    depends_on:
+      - db
+      
+  db:
+    image: postgres:15-alpine3.18
+    container_name: psql-collector
+    volumes:
+      - logs_collector_psql_data:/var/lib/postgresql/data/
+    env_file:
+      - ./.env
+
+
+volumes:
+  logs_collector_data:
+  logs_collector_psql_data:
+```
+
+#### docker-compose-example-psql.yaml with sqlite and bind-mount:
+
+```yaml
+version: "3"
+
+# to set environment variables:
+# create a .env file in the same directory as docker-compose.yaml
+
+services:
+  logs_collector:
+    container_name: logs-collector
+    build:
+      context: .
+      args:
+        - VERSION=${VERSION}
+        - SRC_DIR=${SRC_DIR}
+        - SCRIPTS_DIR=${SCRIPTS_DIR}
+        - APP_DIR=${APP_DIR}
+        - DATA_DIR=${DATA_DIR}
+        - WEB_PORT=${WEB_PORT}
+        - USER_NAME=${USER_NAME}
+        - USER_GROUP=${USER_GROUP}
+        - APP_UID=${APP_UID}
+        - APP_GID=${APP_GID}
+    ports:
+      - "${WEB_HOST}:${WEB_PORT}:${WEB_PORT}"
+    volumes:
+      - "/opt/collector/data:${DATA_DIR}"
+      - "/opt/collector/data/db.sqlite3:${DATA_DIR}/db.sqlite3"
+    env_file:
+      - ./.env
+```
+
+🔴
+
+❗IMPORTANT❗
+
+
+If you use a bind mount and mount it into the application's storage, remember that
+the user in the container is unprivileged (UID 1000). If the mounted file
+or directory belongs to root, the application will not be able to read it and
+therefore will not work.
+
+In a production environment, run the application behind your favorite reverse proxy.
+
+Just add it to the docker-compose.yaml stack.
+
+>You don't have to do this, but Gunicorn recommends following this rule.
+>
+>I agree with them, so you have been warned)
+
+🔴
+
+## Environment variables:
+>The application can be configured by passing the environment variables
+>listed below.
+>If a variable is not passed, its default value is used.
+
+```
+ █▀▄ ░░█ ▄▀█ █▄░█ █▀▀ █▀█ ▀
+ █▄▀ █▄█ █▀█ █░▀█ █▄█ █▄█ ▄
+```
+
+| ENV                  | DEFAULT         | INFO                     |
+| -------------------- | --------------- | ------------------------ |
+| SECRET_KEY           | j9QGbvM9Z4otb47 | ❗change this immediately|
+| DEBUG                | False           | use only False in prod   |
+| ALLOWED_HOSTS        | '*'             | list separated by commas |
+| CSRF_TRUSTED_ORIGINS |                 | list separated by commas |
+| DB_URL               |                 | url for connect db       |
+| TZ                   | 'UTC'           | server timezone          |
+
+
+
+[CSRF_TRUSTED_ORIGINS](https://docs.djangoproject.com/en/4.2/ref/settings/#csrf-trusted-origins)
+
+Required when running in Docker in a production environment.
+Accepts a list of URLs separated by commas:
+>http://localhost,http://*.domain.com,http://127.0.0.1,http://0.0.0.0
+
+
+[DB_URL](https://django-environ.readthedocs.io/en/latest/quickstart.html)
+
+Must be specified if you want to use PostgreSQL.
+The values must match the PostgreSQL container variables:
+
+| ENV               | VALUE          |
+| ----------------- | -------------- |
+| POSTGRES_USER     | admin          |
+| POSTGRES_PASSWORD | ddkwndkjdX7RrP |
+| POSTGRES_DB       | collector      |
+
+Example:
+
+#### psql://admin:ddkwndkjdX7RrP@psql-collector:5432/collector
+- Protocol: **psql://**
+- User: **admin**
+- Password: **ddkwndkjdX7RrP**
+- Address: **psql-collector**
+- Port: **5432**
+- Database name: **collector**
+
+```
+█▀▀ █░█ █▄░█ █ █▀▀ █▀█ █▀█ █▄░█ ▀
+█▄█ █▄█ █░▀█ █ █▄▄ █▄█ █▀▄ █░▀█ ▄
+```
+
+| ENV                         | DEFAULT        |
+| --------------------------- | -------------- |
+| GUNICORN_BIND               | '0.0.0.0:8000' |
+| GUNICORN_BACKLOG            | 2048           |
+| GUNICORN_WORKERS            | 2              |
+| GUNICORN_WORKER_CLASS       | 'sync'         |
+| GUNICORN_WORKER_CONNECTIONS | 1000           |
+| GUNICORN_THREADS            | 1              |
+| GUNICORN_TIMEOUT            | 3600           |
+| GUNICORN_KEEPALIVE          | 2              |
+| GUNICORN_LOGLEVEL           | 'info'         |
+
+[GUNICORN_*](https://docs.gunicorn.org/en/stable/settings.html)
+
+Detailed information about each environment variable is available in the official documentation.
+
+GUNICORN_BIND: do not change this, since the variable sets the listening address and port inside the container.
+
+GUNICORN_TIMEOUT is set to 3600 by default. Such a large timeout is needed to upload large files.
+Since I tried to keep the application minimalistic and avoid a task manager, a file is uploaded in a single thread.
+
+If an upload takes more than an hour, the connection will be dropped; this is a consequence of the synchronous operation of gunicorn workers. If that is not enough time for your uploads, you can increase this value.
+
+❗IMPORTANT❗
+
+Gunicorn is configured to write to the log in the following format:
+```python
+'%({X-Forwarded-For}i)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
+```
+This means that the log takes the request's IP address only from the header
+
+**X-Forwarded-For**
+
+In a production environment, the application must be behind a reverse proxy.
+
+
+## Helpers
+The scripts directory at the root of the project repository contains the uploader.sh script, which lets you send files from the console using curl.
+
+The syntax is simple:
+
+```cmd
+Usage: ./uploader.sh [options [parameters]]
+
+Options:
+
+ -f | --file     full path to upload file required
+ -t | --token    access token             required
+ -u | --url      target url               required
+ -v | --version  print version
+ -h | --help     print help
+```
+
+
+## License
+
+GNU GPL 3.0
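
Note on the DB_URL section above: the README breaks the example URL into protocol, user, password, address, port and database name. The same breakdown can be checked with Python's standard `urllib.parse`. This is only a sketch; the helper names `build_db_url` and `describe_db_url` are hypothetical and are not part of the project, which according to the README link relies on django-environ to read the value.

```python
from urllib.parse import quote, unquote, urlsplit


def build_db_url(user, password, host, port, database, scheme="psql"):
    """Assemble a DB_URL-style string (hypothetical helper, not project code).

    User and password are percent-encoded so that characters such as '@'
    or ':' cannot break the URL structure.
    """
    return (
        f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


def describe_db_url(db_url):
    """Split a DB_URL-style string into the parts listed in the README."""
    parts = urlsplit(db_url)
    return {
        "protocol": parts.scheme,
        "user": unquote(parts.username or ""),
        "password": unquote(parts.password or ""),
        "address": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    url = build_db_url("admin", "ddkwndkjdX7RrP", "psql-collector", 5432, "collector")
    print(url)
    for key, value in describe_db_url(url).items():
        print(f"{key}: {value}")
```

The user, password and database name must be the same values that the PostgreSQL container receives through POSTGRES_USER, POSTGRES_PASSWORD and POSTGRES_DB, and the address is the container name of the database service.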
--- a/README.md
+++ b/README.md
@@ -1,3 +1,329 @@
-# logs-collector
+# LOGS-COLLECTOR

-Серверная сторона для получения и хранения лог файлов
+```sh
+█░░ █▀█ █▀▀ █▀ ▄▄ █▀▀ █▀█ █░░ █░░ █▀▀ █▀▀ ▀█▀ █▀█ █▀█
+█▄▄ █▄█ █▄█ ▄█ ░░ █▄▄ █▄█ █▄▄ █▄▄ ██▄ █▄▄ ░█░ █▄█ █▀▄
+```
+
+###  [CHANGELOG.md](CHANGELOG.md)
+
+###  [Russian lang: README-ru.md](README-ru.md)
+
+
+## Purpose
+
+If you develop software that clients later run in their own infrastructure,
+you know how difficult it can be to investigate a problem
+without access to the server on which the software runs.
+
+To solve this problem, you can configure the software to automatically send
+anonymized crash reports, for example with Sentry.
+This is not always acceptable to the client;
+moreover, the information may be incomplete, or the client
+may require increased confidentiality.
+
+
+## Terms
+- Platform: software developed by your company
+- Ticket: the number associated with a ticket in your help desk system
+- Archive: an uploaded log file (any format is supported)
+
+## How does it work?
+
+- Create platforms
+- Create a ticket associated with a platform and a number
+- Give the client the unique ticket token
+- The client uploads an archive of log files
+- Download the archive (and find a solution to the problem)
+- Delete the archive or the ticket, or mark the ticket as resolved
+  
+## Features
+
+- Centralized storage;
+- No auth credentials are needed to upload a file;
+- Each upload token is unique and associated with only one ticket;
+- The token has a limited number of attempts and a limited lifetime;
+- Files can be uploaded from the console or via the web;
+- Full-featured RESTful API v1;
+- Monitoring of free space in storage;
+- Deleting an archive or ticket also deletes the physical files;
+- The application follows the twelve-factor app architecture;
+- Flexible deployment configuration using environment variables;
+- The application is dockerized; the image size is less than 150 MB;
+- Works with both sqlite3 and PostgreSQL^15;
+- Static file management without configuring a web server for it;
+- Healthcheck for application availability;
+
+## Security
+
+- The upload token is not associated with authorization
+- The upload token has high entropy
+- Two-factor authentication for users
+- 2FA must be enabled to download a file
+- The admin panel is patched to enforce 2FA
+- The user in the container is unprivileged
+- Standard Django and DRF protection methods
+
+## Install
+
+### From the docker image:
+- Create a directory for the application wherever it is convenient for you
+- Create a docker-compose.yml file in the application directory
+- Create a .env file in the application directory
+- Fill the .env file with the required environment variables, see below
+
+>Example file using a Docker volume for storage and sqlite as the default database:
+
+```yaml
+version: "3"
+
+# to set environment variables:
+# create a .env file in the same directory as docker-compose.yaml
+
+services:
+  server:
+    image: mois3y/logs_collector:0.1.0
+    container_name: logs-collector
+    restart: unless-stopped
+    env_file:
+      - ./.env
+    ports:
+      - "80:8000"
+    volumes:
+      - /etc/timezone:/etc/timezone:ro  # optional
+      - /etc/localtime:/etc/localtime:ro  # optional
+      - logs_collector_data:/data
+
+volumes:
+  logs_collector_data:
+```
+
+### From the source:
+- Clone the repository
+- docker-compose.yaml is already in the project directory
+- Create a .env file in the project root
+- Fill .env with the required environment variables, see below
+- Build the image and run the container in the background:
+  
+```sh
+docker-compose up -d --build
+```
+- You can create your own file and make the necessary edits:
+#### docker-compose.yaml with PostgreSQL by default:
+
+```yaml
+services:
+  logs_collector:
+    container_name: logs-collector
+    build:
+      context: .
+      args:
+        - VERSION=${VERSION}
+        - SRC_DIR=${SRC_DIR}
+        - SCRIPTS_DIR=${SCRIPTS_DIR}
+        - APP_DIR=${APP_DIR}
+        - DATA_DIR=${DATA_DIR}
+        - WEB_PORT=${WEB_PORT}
+        - USER_NAME=${USER_NAME}
+        - USER_GROUP=${USER_GROUP}
+        - APP_UID=${APP_UID}
+        - APP_GID=${APP_GID}
+    ports:
+      - "${WEB_HOST}:${WEB_PORT}:${WEB_PORT}"
+    volumes:
+      - type: volume
+        source: logs_collector_data
+        target: ${APP_DIR}/data
+    env_file:
+      - ./.env
+    depends_on:
+      - db
+      
+  db:
+    image: postgres:15-alpine3.18
+    container_name: psql-collector
+    volumes:
+      - logs_collector_psql_data:/var/lib/postgresql/data/
+    env_file:
+      - ./.env
+
+
+volumes:
+  logs_collector_data:
+  logs_collector_psql_data:
+```
+
+#### docker-compose-example-psql.yaml with sqlite and bind-mount:
+
+```yaml
+version: "3"
+
+# to set environment variables:
+# create a .env file in the same directory as docker-compose.yaml
+
+services:
+  logs_collector:
+    container_name: logs-collector
+    build:
+      context: .
+      args:
+        - VERSION=${VERSION}
+        - SRC_DIR=${SRC_DIR}
+        - SCRIPTS_DIR=${SCRIPTS_DIR}
+        - APP_DIR=${APP_DIR}
+        - DATA_DIR=${DATA_DIR}
+        - WEB_PORT=${WEB_PORT}
+        - USER_NAME=${USER_NAME}
+        - USER_GROUP=${USER_GROUP}
+        - APP_UID=${APP_UID}
+        - APP_GID=${APP_GID}
+    ports:
+      - "${WEB_HOST}:${WEB_PORT}:${WEB_PORT}"
+    volumes:
+      - "/opt/collector/data:${DATA_DIR}"
+      - "/opt/collector/data/db.sqlite3:${DATA_DIR}/db.sqlite3"
+    env_file:
+      - ./.env
+```
+
+🔴
+
+❗IMPORTANT❗
+
+If you use a bind mount and mount it into the application's storage,
+remember that the user in the container is unprivileged (UID 1000).
+If the mounted file or directory belongs to root, the
+application will not be able to read it and therefore will not work.
+
+In a production environment, run the application behind your favorite reverse proxy.
+
+Just add it to the docker-compose.yaml stack.
+
+>You don't have to do this, but Gunicorn recommends following this rule.
+>
+>I agree with them, so you have been warned)
+
+🔴
+
+## Environment:
+>The application can be configured
+>by passing the environment variables listed below.
+>If a variable is not passed, its default value is used.
+
+```
+ █▀▄ ░░█ ▄▀█ █▄░█ █▀▀ █▀█ ▀
+ █▄▀ █▄█ █▀█ █░▀█ █▄█ █▄█ ▄
+```
+
+| ENV                  | DEFAULT         | INFO                     |
+| -------------------- | --------------- | ------------------------ |
+| SECRET_KEY           | j9QGbvM9Z4otb47 | ❗change this immediately|
+| DEBUG                | False           | use only False in prod   |
+| ALLOWED_HOSTS        | '*'             | list separated by commas |
+| CSRF_TRUSTED_ORIGINS |                 | list separated by commas |
+| DB_URL               |                 | url for connect db       |
+| TZ                   | 'UTC'           | server timezone          |
+
+
+
+[CSRF_TRUSTED_ORIGINS](https://docs.djangoproject.com/en/4.2/ref/settings/#csrf-trusted-origins)
+
+Required when running in Docker in a production environment.
+Accepts a list of URLs separated by commas:
+>http://localhost,http://*.domain.com,http://127.0.0.1,http://0.0.0.0
+
+
+[DB_URL](https://django-environ.readthedocs.io/en/latest/quickstart.html)
+
+Must be specified if you want to use PostgreSQL.
+The values must match the PostgreSQL container variables:
+
+| ENV               | VALUE          |
+| ----------------- | -------------- |
+| POSTGRES_USER     | admin          |
+| POSTGRES_PASSWORD | ddkwndkjdX7RrP |
+| POSTGRES_DB       | collector      |
+
+Example:
+
+#### psql://admin:ddkwndkjdX7RrP@psql-collector:5432/collector
+- Protocol: **psql://**
+- User: **admin**
+- Password: **ddkwndkjdX7RrP**
+- Address: **psql-collector**
+- Port: **5432**
+- Database name: **collector**
+
+```
+█▀▀ █░█ █▄░█ █ █▀▀ █▀█ █▀█ █▄░█ ▀
+█▄█ █▄█ █░▀█ █ █▄▄ █▄█ █▀▄ █░▀█ ▄
+```
+
+| ENV                         | DEFAULT        |
+| --------------------------- | -------------- |
+| GUNICORN_BIND               | '0.0.0.0:8000' |
+| GUNICORN_BACKLOG            | 2048           |
+| GUNICORN_WORKERS            | 2              |
+| GUNICORN_WORKER_CLASS       | 'sync'         |
+| GUNICORN_WORKER_CONNECTIONS | 1000           |
+| GUNICORN_THREADS            | 1              |
+| GUNICORN_TIMEOUT            | 3600           |
+| GUNICORN_KEEPALIVE          | 2              |
+| GUNICORN_LOGLEVEL           | 'info'         |
+
+[GUNICORN_*](https://docs.gunicorn.org/en/stable/settings.html)
+
+Detailed information about each environment variable is available in
+the official documentation.
+
+**GUNICORN_BIND**: do not change this, since the variable
+sets the listening address and port inside the container.
+
+**GUNICORN_TIMEOUT** is set to 3600 by default.
+Such a large timeout is needed to upload large files.
+Since I tried to keep the application minimalistic and avoid a task manager,
+a file is uploaded in a single thread.
+
+If an upload takes more than an hour, the connection will be dropped;
+this is a consequence of the synchronous operation of gunicorn workers.
+If that is not enough time for your uploads, you can increase this value.
+
+
+❗IMPORTANT❗
+
+Gunicorn is configured to write to the log in the following format:
+```python
+'%({X-Forwarded-For}i)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
+```
+This means that the log takes the request's IP address only from the header
+
+**X-Forwarded-For**
+
+In a production environment, the application must be behind a reverse proxy.
+
+
+## Helpers
+The scripts directory at the root of the project repository
+contains the uploader.sh script, which lets you send files
+from the console using **curl**.
+
+The syntax is simple:
+
+```cmd
+Usage: ./uploader.sh [options [parameters]]
+
+Options:
+
+ -f | --file     full path to upload file required
+ -t | --token    access token             required
+ -u | --url      target url               required
+ -v | --version  print version
+ -h | --help     print help
+```
+
+
+
+
+## License
+
+GNU GPL 3.0
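
Note on the GUNICORN_* table above: the project's actual Gunicorn configuration file is not part of this commit, so the following is only a sketch of how such variables could be mapped onto Gunicorn setting names with the documented defaults. The function name `load_gunicorn_settings` is hypothetical; the resulting keys (`bind`, `backlog`, `workers`, `worker_class`, `worker_connections`, `threads`, `timeout`, `keepalive`, `loglevel`) are regular Gunicorn settings.

```python
import os

# Defaults copied from the README table; the type of each default decides
# how the raw environment string is converted.
DEFAULTS = {
    "GUNICORN_BIND": "0.0.0.0:8000",
    "GUNICORN_BACKLOG": 2048,
    "GUNICORN_WORKERS": 2,
    "GUNICORN_WORKER_CLASS": "sync",
    "GUNICORN_WORKER_CONNECTIONS": 1000,
    "GUNICORN_THREADS": 1,
    "GUNICORN_TIMEOUT": 3600,
    "GUNICORN_KEEPALIVE": 2,
    "GUNICORN_LOGLEVEL": "info",
}

PREFIX = "GUNICORN_"


def load_gunicorn_settings(environ=None):
    """Return a dict of Gunicorn settings built from GUNICORN_* variables.

    Hypothetical helper: a variable that is not set falls back to the
    default from the table above.
    """
    environ = os.environ if environ is None else environ
    settings = {}
    for env_name, default in DEFAULTS.items():
        raw = environ.get(env_name)
        key = env_name[len(PREFIX):].lower()  # GUNICORN_TIMEOUT -> timeout
        settings[key] = default if raw is None else type(default)(raw)
    return settings
```

For example, `load_gunicorn_settings({"GUNICORN_TIMEOUT": "7200"})` raises the upload window to two hours and leaves every other setting at its default.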
--- a/logs_collector/collector/management/commands/cleanup_media.py
+++ b/logs_collector/collector/management/commands/cleanup_media.py
@@ -1,4 +1,5 @@
 import os
+import logging.config  # loads the config submodule and binds the top-level "logging" name
 from django.core.management.base import BaseCommand
 from django.apps import apps
 from django.db.models import Q
@@ -6,6 +7,31 @@ from django.conf import settings
 from django.db.models import FileField


+logger = logging.getLogger(__name__)
+
+logging.config.dictConfig({
+    'version': 1,
+    'disable_existing_loggers': False,
+    'formatters': {
+        'console': {
+            'format': '%(asctime)s %(name)-12s %(levelname)-8s %(message)s'
+        },
+    },
+    'handlers': {
+        'console': {
+            'class': 'logging.StreamHandler',
+            'formatter': 'console'
+        },
+    },
+    'loggers': {
+        '': {
+            'level': 'INFO',
+            'handlers': ['console']
+        }
+    }
+})
+
+
 class Command(BaseCommand):
    # HELP MESSAGE:
    help_part1 = 'This command deletes all media files from'
@@ -14,10 +40,12 @@ class Command(BaseCommand):
    help = f'{help_part1} {help_part2} {help_part3}'

    def handle(self, *args, **options):
+        logger.info('Start cleanup storage....')
        all_models = apps.get_models()
        physical_files = set()
        db_files = set()
        # Get all files from the database
+        logger.info('Get all files from the database....')
        for model in all_models:
            file_fields = []
            filters = Q()
@@ -35,7 +63,9 @@ class Command(BaseCommand):
                    flat=True
                ).distinct()
                db_files.update(files)
+        logger.info(f'Found: {len(db_files)} files in the database')
        # Get all files from the MEDIA_ROOT, recursively
+        logger.info('Get all files from the MEDIA_ROOT, recursively....')
        media_root = getattr(settings, 'MEDIA_ROOT', None)
        if media_root is not None:
            for relative_root, dirs, files in os.walk(media_root):
@@ -46,14 +76,22 @@ class Command(BaseCommand):
                        os.path.relpath(relative_root, media_root), file_
                    )
                    physical_files.add(relative_file)
+        logger.info(f'Found: {len(physical_files)} files in the MEDIA_ROOT')
        # Compute the difference and delete those files
+        logger.info('Compute the difference and delete those files....')
        deletables = physical_files - db_files
+        logger.info(f'Found: {len(deletables)} orphan files')
        if deletables:
            for file_ in deletables:
+                logger.info(f"Delete orphan file: {file_}")
                os.remove(os.path.join(media_root, file_))
            # Bottom-up - delete all empty folders
+            logger.info('Bottom-up - delete all empty folders....')
            for relative_root, dirs, files in os.walk(
                    media_root, topdown=False):
                for dir_ in dirs:
                    if not os.listdir(os.path.join(relative_root, dir_)):
                        os.rmdir(os.path.join(relative_root, dir_))
+            logger.info('Done! Storage has been cleaned up')
+        else:
+            logger.info('Done! Nothing to delete')
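
Note on cleanup_media.py: the command compares the files referenced in the database with the files under MEDIA_ROOT and removes the difference. A standalone sketch of that set-difference logic, without Django, may make it easier to test. The function names `find_orphans` and `cleanup` are hypothetical, and `db_files` is assumed to hold paths relative to the media root, as the command's own comparison implies.

```python
import logging
import os

logger = logging.getLogger(__name__)


def find_orphans(media_root, db_files):
    """Return relative paths that exist on disk but are not in db_files."""
    physical_files = set()
    for root, _dirs, files in os.walk(media_root):
        for name in files:
            relative = os.path.normpath(
                os.path.join(os.path.relpath(root, media_root), name)
            )
            physical_files.add(relative)
    return physical_files - set(db_files)


def cleanup(media_root, db_files):
    """Delete orphan files, then prune empty folders bottom-up."""
    orphans = find_orphans(media_root, db_files)
    if orphans:
        for relative in sorted(orphans):
            logger.info("Delete orphan file: %s", relative)
            os.remove(os.path.join(media_root, relative))
        # Walk bottom-up so that nested empty folders are removed first.
        for root, dirs, _files in os.walk(media_root, topdown=False):
            for dir_ in dirs:
                path = os.path.join(root, dir_)
                if not os.listdir(path):
                    os.rmdir(path)
        logger.info("Done! Storage has been cleaned up")
    else:
        logger.info("Done! Nothing to delete")
    return orphans
```

In this sketch the "Nothing to delete" message is logged only when no orphans were found, and the paths are normalized so that files directly under the media root compare equal to their database entries.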
Author	SHA1	Message	Date
MOIS3Y	eaeecc926a	Add: readme and changelog	2023-09-16 15:28:43 +09:00
MOIS3Y	51950cb7d2	Add: logging cleanup_media cmd	2023-09-16 15:23:51 +09:00