mirror of
https://github.com/nethunterzist/trendyol-analiz
synced 2026-07-01 01:17:04 +00:00
feat: switch to a single consolidated JSON structure + social proof fallback
What we did:
- data_consolidator.py: moved all normalization and calculation logic out of main.py
- Dashboard endpoint shrank from 1150 lines to 25 (main.py -1730/+1880 net)
- A consolidated file (report_{id}_data.json) is created automatically when enrichment finishes
- Old reports are consolidated via lazy migration on the first dashboard request
- Added a baskets fallback because the Trendyol API no longer returns order-count
- Applied the priority order: inline socialProofs (scrape) > enrichment API
- Frontend KPI titles change dynamically depending on the orders/baskets state
- Added logging_config.py, category_seeder.py, and an alembic migration
- Tested 9 tabs with Playwright; all data is correct
Why we did it:
- Merging data from 3 different sources on every request caused data inconsistency and slowness
- With a single consolidated JSON file the dashboard loads instantly
- Order data was being lost because of the Trendyol API change; the baskets fallback solves this
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
137
CLAUDE.md
@@ -1,12 +1,12 @@
# CLAUDE.md

This file is the project guide for Claude Code (claude.ai/code).
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Summary

**Trendyol Product Dashboard**: A category-based product analysis system for the Trendyol e-commerce platform. 7-tab dashboard, automatic report generation, and social proof metrics.
**Trendyol Product Dashboard**: A category-based product analysis system for the Trendyol e-commerce platform. 9-tab dashboard, automatic report generation, social proof metrics, and hidden champion analysis.

**Stack**: FastAPI + React 19 + Vite + SQLite + Tailwind CSS
**Stack**: FastAPI + React 19 + Vite + PostgreSQL + Tailwind CSS

## Development Commands

@@ -15,17 +15,33 @@ This file is the project guide for Claude Code (claude.ai/code).
python3 start.py

# Manual start (two terminals)
cd backend && python3 main.py # Terminal 1 - Backend
cd admin-panel && npm run dev # Terminal 2 - Frontend
cd backend && python3 main.py # Terminal 1 - Backend (port 8001)
cd admin-panel && npm run dev # Terminal 2 - Frontend (port 5173)

# Dependency installation
cd backend && pip install -r requirements.txt # Python
cd admin-panel && npm install # Node.js

# Other commands
cd admin-panel && npm run build # Frontend build
cd admin-panel && npm run lint # Lint
cd backend && python3 -c "from database import init_db; init_db()" # DB init
# Build & lint
cd admin-panel && npm run build # Frontend production build
cd admin-panel && npm run lint # ESLint

# Backend tests
cd backend && pytest # All tests
cd backend && pytest tests/test_cache.py # Single test file
cd backend && pytest tests/test_cache.py -k "test_ttl" # Single test

# Frontend E2E tests (Playwright)
cd admin-panel && npx playwright test # All E2E tests
cd admin-panel && npx playwright test tests/rare-keywords.spec.js # Single spec

# Run with Docker
./build-docker.sh && ./start-docker.sh # Build + start
./stop-docker.sh # Stop

# DB migration
cd backend && alembic upgrade head # Apply migration
cd backend && alembic revision --autogenerate -m "description" # New migration
```

**Access URLs**:
@@ -39,23 +55,36 @@ cd backend && python3 -c "from database import init_db; init_db()" # DB init

### 3-Layer Structure
```
React Frontend (admin-panel/)  →  FastAPI Backend (backend/)  →  SQLite + JSON
├── CategoryManagement.jsx        ├── main.py (~4400 lines)      ├── trendyol.db
├── ReportGeneration.jsx          ├── database.py                ├── categories/*.json
├── ReportList.jsx                └── scraper.py                 └── reports/*.json
└── ReportDashboard.jsx (7 tab)
React Frontend (admin-panel/)  →  FastAPI Backend (backend/)  →  PostgreSQL + JSON
├── ReportDashboard.jsx (9 tab)   ├── main.py (~5000 lines)      ├── trendyol_db
├── ReportGeneration.jsx          ├── database.py (ORM)          ├── categories/*.json
├── ReportList.jsx                ├── scraper.py                 └── reports/*.json
├── ReportComparison.jsx          ├── google_trends_helper.py
└── CategoryManagement.jsx        └── analytics/
                                      ├── metrics.py
                                      └── champion_finder.py
```

### Dashboard Tabs (7 total)
### Frontend Routes
| Path | Component | Description |
|------|-----------|----------|
| `/` or `/report` | ReportGeneration | Create a new report |
| `/reports` | ReportList | Saved reports |
| `/reports/:reportId` | ReportDashboard | 9-tab analysis dashboard |
| `/compare` | ReportComparison | Side-by-side report comparison |

### Dashboard Tabs (9 total)
| Tab ID | Tab Name | Component | Description |
|--------|---------|-----------|----------|
| overview | Genel Bakış | OverviewTab | KPIs, summary charts |
| brand | Marka | BrandTab | Brand analysis, market share |
| category | Kategori | CategoryTab | Category distribution |
| origin | Menşei | OriginTab | Country-based analysis |
| barcode | Barkod | BarcodeTab | Barcode data analysis |
| keyword | Keyword Aracı | KeywordTab | Keyword analysis |
| barcode | Barkod | BarcodeTab | Barcode/GS1 origin analysis |
| keyword | Keyword Aracı | KeywordTab | Keyword analysis + Google Trends |
| product-finder | Ürün Bulma | ProductFinderTab | Product search/filtering |
| hidden-champions | Gizli Şampiyonlar | HiddenChampionsTab | Low-review, high-rating opportunities |
| opportunity | Fırsat Analizi | OpportunityTab | Market opportunity analysis |

### Data Flow

@@ -77,12 +106,12 @@ React Frontend (admin-panel/) → FastAPI Backend (backend/) → SQLite +
**Use the ready-made objects from the backend; do NOT do raw calculations:**

```jsx
// ✅ CORRECT - Use the ready-made data
// CORRECT - Use the ready-made data
const kpis = dashboardData?.kpis || {};
const topProducts = dashboardData?.charts?.top_products || [];
const topBrands = dashboardData?.charts?.top_brands || [];

// ❌ WRONG - Do not compute from all_products
// WRONG - Do not compute from all_products
const total = dashboardData?.all_products.reduce((sum, p) => sum + p.price, 0);
```

@@ -97,12 +126,11 @@ Frontend-computed data can lead to field-name mismatches. For details:

**Solution Pattern - Mapping Layer**:
```jsx
// Transform the data to match what the component expects
const transformed = sourceData.map(item => ({
  country: item.name, // Map to the expected field
  name: item.name, // Keep the original
  count: item.productCount, // Map to the expected field
  productCount: item.productCount // Keep the original
  country: item.name,
  name: item.name,
  count: item.productCount,
  productCount: item.productCount
}));
```

@@ -111,7 +139,7 @@ const transformed = sourceData.map(item => ({
1. Add the tab config to `src/constants/tabGroups.js`
2. Create the tab component under `src/components/dashboard-tabs/`
3. Import it in `ReportDashboard.jsx` and add a render block
4. **Always add console.log for data transformation**
4. If needed, add a new endpoint to the backend (`main.py`)

## API Integration

@@ -123,15 +151,10 @@ const transformed = sourceData.map(item => ({
| ENRICHMENT | 120s | Social proof enrichment |
| KEYWORD_ANALYSIS | 300s | Keyword analysis |

### Polling Pattern
```jsx
// Exponential backoff with jitter (1s → 5s max)
import { fetchWithTimeout, API_BASE_URL } from '../config/api';
```

### Rate Limit
- Social proof API: 2 requests/second
- Exponential backoff is used (achieved a 75% reduction in requests)
### Rate Limit & Resilience
- Social proof API: 2 requests/second (RateLimiter)
- Circuit breaker pattern for external API calls
- Exponential backoff with jitter (1s → 5s max)
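
The exponential backoff with jitter bullet above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the project's implementation: the function name `backoff_delay` and its parameters are hypothetical, and only the 1s floor and the 5s cap come from this file.

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 5.0) -> float:
    """Return a retry delay in seconds for a 0-based attempt number.

    Hypothetical helper: the ceiling doubles with each attempt and is capped
    at `cap`; the delay is drawn uniformly between `base` and that ceiling so
    that concurrent clients do not retry in lockstep.
    """
    if attempt < 0:
        raise ValueError("attempt must be >= 0")
    # Clamp the exponent so very large attempt numbers cannot overflow.
    exponent = min(attempt, 16)
    ceiling = min(cap, base * (2 ** exponent))
    return random.uniform(base, ceiling)
```

With the defaults, attempt 0 always waits 1s, attempt 1 waits between 1s and 2s, attempt 2 between 1s and 4s, and every later attempt between 1s and 5s.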

## Code Change Rules

@@ -141,18 +164,45 @@ import { fetchWithTimeout, API_BASE_URL } from '../config/api';
- Long-running operations: BackgroundTasks + progress polling endpoint
- External API calls: always add a timeout parameter
- Cache: use BoundedCache (never use an unbounded dict)
- Analytics calculations: put them in the `analytics/` module (metrics.py, champion_finder.py)
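
The cache rule above can be illustrated with a small size-bounded LRU cache. This is a sketch only: `SketchBoundedCache` and its methods are hypothetical names, not the project's `BoundedCache`, whose real interface (for example any TTL handling) is not shown in this file.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class SketchBoundedCache:
    """Hypothetical LRU cache with a hard size limit (illustration only)."""

    def __init__(self, max_size: int = 128) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable, default: Optional[Any] = None) -> Any:
        """Return the cached value and mark the key as most recently used."""
        if key not in self._items:
            return default
        self._items.move_to_end(key)
        return self._items[key]

    def set(self, key: Hashable, value: Any) -> None:
        """Store a value, evicting least recently used entries beyond the limit."""
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self._max_size:
            self._items.popitem(last=False)

    def __len__(self) -> int:
        return len(self._items)
```

Unlike a plain dict, the number of entries can never exceed `max_size`, so memory use stays bounded no matter how many distinct keys are requested.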

### Frontend
- Use `fetchWithTimeout` (from `src/config/api.js`)
- Show a loading state for async operations
- Apply request deduplication for concurrent calls
- Charts: use Recharts; data transformation lives in `utils/chartTransformers.js`
- Export: CSV/Excel via `utils/exportUtils.js`

### CORS Changes
For new frontend ports, add them to the CORS allowlist in `main.py` (lines 34-45):
For new frontend ports, add them to the CORS allowlist in `main.py`:
```python
allow_origins=["http://localhost:5173", "http://localhost:5174", ...]
```

## Database

**Dev**: `postgresql://postgres:trendyol123@localhost:5433/trendyol_db`
**Docker**: `postgresql://postgres:trendyol123@postgres:5432/trendyol_db`

Migrations: Alembic (`backend/alembic/`). Run `alembic revision --autogenerate` on every schema change.

| Model | Purpose | Key Fields |
|-------|------|-----------------|
| Category | Hierarchical category tree | `parent_id` (self-ref), `trendyol_category_id` |
| Snapshot | Monthly data snapshots | `category_id`, `json_file_path` |
| Report | Saved reports | `category_id`, `json_file_path` |
| EnrichmentError | API error logs | `endpoint`, `error_type`, `status_code` |

## Deployment

**Platform**: Coolify + Docker Compose + Traefik reverse proxy

Docker Compose services: `postgres` (15-alpine), `backend` (FastAPI), `frontend` (Nginx)

`startup.sh` order: wait for the PostgreSQL connection → Alembic migration → category seeding → start Uvicorn

Traefik SSE streaming support: 100ms flush interval (for report progress)

## Resource Limits

| Resource | Limit |
@@ -163,26 +213,11 @@ allow_origins=["http://localhost:5173", "http://localhost:5174", ...]
| Social proof batch | 5 products/request |
| Rate limit | 2 requests/second (social proof) |

## Critical Dependencies

**Backend**: FastAPI 0.104.1, SQLAlchemy 2.0.45, Uvicorn 0.24.0, Requests 2.31.0, Pytrends 4.9.2

**Frontend**: React 19.2.0, Vite 7.2.2, Recharts 3.4.1, Tailwind CSS 4.1.17, Axios 1.13.2

## Database Models

| Model | Purpose | Key Fields |
|-------|------|-----------------|
| Category | Hierarchical category tree | `parent_id` (self-ref), `trendyol_category_id` |
| Snapshot | Monthly data snapshots | `category_id`, `json_file_path` |
| Report | Saved reports | `category_id`, `json_file_path` |
| EnrichmentError | API error logs | `endpoint`, `error_type`, `status_code` |

## Documentation

| File | Purpose |
|-------|------|
| docs/DASHBOARD_ARCHITECTURE.md | **Important** - Dashboard data structures |
| docs/DASHBOARD_ARCHITECTURE.md | Dashboard data structures and KPI definitions |
| docs/bug-fixes/ORIGINTAB_BUG_FIX.md | **Critical** - Field-name mismatch pattern |
| docs/API_DOCUMENTATION.md | Full API reference |
| docs/ARCHITECTURE.md | System architecture (Turkish) |

@@ -99,17 +99,27 @@ function ReportDashboard() {

const products = dashboardData.all_products
const totalProducts = products.length
const totalOrders = products.reduce((sum, p) => sum + (p.orders || 0), 0)
const rawOrders = products.reduce((sum, p) => sum + (p.orders || 0), 0)
const totalBaskets = products.reduce((sum, p) => sum + (p.baskets || 0), 0)
// The Trendyol API no longer returns order-count: use orders if > 0, otherwise fall back to baskets
const totalOrders = rawOrders > 0 ? rawOrders : totalBaskets
const ordersLabel = rawOrders > 0 ? 'orders' : 'baskets'
const totalViews = products.reduce((sum, p) => sum + (p.page_views || 0), 0)
const totalFavorites = products.reduce((sum, p) => sum + (p.favorites || 0), 0)
const avgPrice = products.reduce((sum, p) => sum + (p.price || 0), 0) / totalProducts
const totalRevenue = products.reduce((sum, p) => sum + ((p.price || 0) * (p.orders || 0)), 0)
const totalRevenue = rawOrders > 0
  ? products.reduce((sum, p) => sum + ((p.price || 0) * (p.orders || 0)), 0)
  : products.reduce((sum, p) => sum + ((p.price || 0) * (p.baskets || 0)), 0)

const kpis = {
  totalProducts,
  totalOrders,
  totalBaskets,
  totalViews,
  totalFavorites,
  avgPrice: Math.round(avgPrice),
  totalRevenue: Math.round(totalRevenue)
  totalRevenue: Math.round(totalRevenue),
  ordersLabel
}

console.log('✅ [KPI] Calculated KPIs:', kpis)

@@ -12,8 +12,8 @@ export default function HiddenChampionsTab({ reportId }) {
  // Filters
  const [minRating, setMinRating] = useState(4.0)
  const [maxReview, setMaxReview] = useState(100)
  const [minOrders, setMinOrders] = useState(5)
  const [sortKey, setSortKey] = useState('performance_score')
  const [minOrders, setMinOrders] = useState(0)
  const [sortKey, setSortKey] = useState('hidden_champion_score')
  const [sortDir, setSortDir] = useState('desc')
  const [showFilters, setShowFilters] = useState(false)

@@ -41,9 +41,9 @@ export default function HiddenChampionsTab({ reportId }) {

  // Filtered & sorted products
  const filteredProducts = useMemo(() => {
    if (!data?.products) return []
    if (!data?.hidden_champions) return []

    return data.products
    return data.hidden_champions
      .filter(p => {
        const rating = p.rating || 0
        const reviewCount = p.review_count || p.reviewCount || 0
@@ -230,10 +230,10 @@ export default function HiddenChampionsTab({ reportId }) {
</th>
<th
  className="text-right px-4 py-3 font-medium text-slate-500 cursor-pointer hover:text-slate-700"
  onClick={() => handleSort('performance_score')}
  onClick={() => handleSort('hidden_champion_score')}
>
  <div className="flex items-center justify-end gap-1">
    Skor <SortIcon column="performance_score" />
    Skor <SortIcon column="hidden_champion_score" />
  </div>
</th>
</tr>
@@ -287,13 +287,13 @@ export default function HiddenChampionsTab({ reportId }) {
</td>
<td className="px-4 py-3 text-right">
  <span className={`inline-flex items-center px-2 py-0.5 rounded-full text-xs font-bold ${
    (product.performance_score || 0) >= 70
    (product.hidden_champion_score || 0) >= 70
      ? 'bg-emerald-100 text-emerald-700'
      : (product.performance_score || 0) >= 40
      : (product.hidden_champion_score || 0) >= 40
        ? 'bg-amber-100 text-amber-700'
        : 'bg-slate-100 text-slate-600'
  }`}>
    {(product.performance_score || 0).toFixed(0)}
    {(product.hidden_champion_score || 0).toFixed(0)}
  </span>
</td>
</tr>

@@ -90,21 +90,21 @@ export default function OverviewTab({
  ? (sortedPrices[sortedPrices.length / 2 - 1] + sortedPrices[sortedPrices.length / 2]) / 2
  : sortedPrices[Math.floor(sortedPrices.length / 2)]

const bucketCount = 10
const range = max - min || 1
const bucketSize = range / bucketCount
// Use predefined price ranges for meaningful distribution
const ranges = [
  [0, 50], [50, 100], [100, 200], [200, 500],
  [500, 1000], [1000, 2000], [2000, 5000], [5000, 10000], [10000, Infinity]
]

const buckets = Array.from({ length: bucketCount }, (_, i) => ({
  range: `₺${Math.round(min + i * bucketSize)}-${Math.round(min + (i + 1) * bucketSize)}`,
  min: min + i * bucketSize,
  max: min + (i + 1) * bucketSize,
  count: 0
// Filter out empty ranges and build buckets
const buckets = ranges
  .map(([lo, hi]) => ({
    range: hi === Infinity ? `₺${lo.toLocaleString('tr-TR')}+` : `₺${lo.toLocaleString('tr-TR')}-${hi.toLocaleString('tr-TR')}`,
    min: lo,
    max: hi,
    count: prices.filter(p => p >= lo && (hi === Infinity ? true : p < hi)).length
}))

prices.forEach(price => {
  const idx = Math.min(Math.floor((price - min) / bucketSize), bucketCount - 1)
  buckets[idx].count++
})
  .filter(b => b.count > 0)

return { buckets, mean: Math.round(mean), median: Math.round(median) }
}, [allProducts])
@@ -186,7 +186,7 @@ export default function OverviewTab({
  color="blue"
/>
<KpiCard
  title="Toplam Satın Alma"
  title={overviewKPIs.ordersLabel === 'baskets' ? 'Toplam Sepete Ekleme' : 'Toplam Satın Alma'}
  value={overviewKPIs.totalOrders.toLocaleString('tr-TR')}
  icon={ShoppingCart}
  color="emerald"
@@ -198,7 +198,7 @@ export default function OverviewTab({
  color="violet"
/>
<KpiCard
  title="Toplam Ciro"
  title={overviewKPIs.ordersLabel === 'baskets' ? 'Tahmini Ciro (Sepet)' : 'Toplam Ciro'}
  value={`₺${(overviewKPIs.totalRevenue || 0).toLocaleString('tr-TR')}`}
  icon={DollarSign}
  color="orange"
@@ -359,10 +359,10 @@ export default function OverviewTab({
  contentStyle={{ borderRadius: '8px', border: '1px solid #e2e8f0' }}
/>
<ReferenceLine
  x={priceDistribution.buckets.findIndex(b => b.min <= priceDistribution.mean && b.max > priceDistribution.mean)}
  x={(priceDistribution.buckets.find(b => b.min <= priceDistribution.mean && (b.max === Infinity || b.max > priceDistribution.mean)) || {}).range}
  stroke="#f97316"
  strokeDasharray="5 5"
  label={{ value: `Ort: ₺${priceDistribution.mean}`, fill: '#f97316', fontSize: 11, position: 'top' }}
  label={{ value: `Ort: ₺${priceDistribution.mean.toLocaleString('tr-TR')}`, fill: '#f97316', fontSize: 11, position: 'top' }}
/>
<Bar dataKey="count" fill="#6366f1" radius={[4, 4, 0, 0]} label={{ position: 'top', fill: '#64748b', fontSize: 11 }} />
</BarChart>

@@ -30,7 +30,7 @@ COPY backend/ .
COPY categories/ /data/initial-categories/

# Create data directories with proper permissions
RUN mkdir -p /data/categories /data/reports && \
RUN mkdir -p /data/categories /data/reports /data/logs && \
    chmod -R 755 /data

# Make startup script executable (before switching to non-root user)

@@ -0,0 +1,30 @@
"""add path_model to categories

Revision ID: 38207dbbac44
Revises: 001
Create Date: 2026-03-28 14:56:06.784769

"""
from typing import Sequence, Union

from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision: str = '38207dbbac44'
down_revision: Union[str, None] = '001'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None


def upgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.add_column('categories', sa.Column('path_model', sa.String(), nullable=True))
    # ### end Alembic commands ###


def downgrade() -> None:
    # ### commands auto generated by Alembic - please adjust! ###
    op.drop_column('categories', 'path_model')
    # ### end Alembic commands ###
@@ -17,6 +17,51 @@ class HiddenChampionFinder:
    Uses customized filters in fragmented markets (low HHI)
    """

    @staticmethod
    def _parse_social_proof_value(value_str: str) -> int:
        """Convert values such as '3k', '248k', '1.2k', '866' to an integer."""
        if not value_str:
            return 0
        value_str = str(value_str).strip().lower()  # keep "." so '1.2k' parses as 1200, not 12000
        if value_str.endswith("k"):
            try:
                return int(round(float(value_str[:-1]) * 1000))
            except (ValueError, TypeError):
                return 0
        if value_str.endswith("m"):
            try:
                return int(round(float(value_str[:-1]) * 1000000))
            except (ValueError, TypeError):
                return 0
        try:
            return int(value_str.replace(".", ""))  # plain values: drop "." thousands separators
        except (ValueError, TypeError):
            return 0

    @staticmethod
    def _extract_social_proofs(product: Dict) -> Dict[str, int]:
        """Extract data from the product's socialProofs array."""
        result = {"page_views": 0, "orders": 0, "baskets": 0, "favorites": 0}
        social_proofs = product.get("socialProofs", [])
        if not social_proofs:
            return result
        type_map = {
            "pageViewCount": "page_views",
            "orderCountL3D": "orders",
            "orderCountL365D": "orders",
            "basketCount": "baskets",
            "favoriteCount": "favorites",
        }
        for sp in social_proofs:
            sp_type = sp.get("type", "")
            mapped = type_map.get(sp_type)
            if mapped:
                val = HiddenChampionFinder._parse_social_proof_value(sp.get("value", "0"))
                # Keep the larger value (orderCountL3D vs orderCountL365D)
                if val > result[mapped]:
                    result[mapped] = val
        return result

    def find(
        self,
        products: List[Dict],
@@ -98,10 +143,12 @@ class HiddenChampionFinder:
pid = str(product.get("id"))
social = social_details.get(pid, {})

page_views = social.get("page_views", 0) or 0
orders = social.get("orders", 0) or 0
baskets = social.get("baskets", 0) or 0
favorites = social.get("favorites", 0) or 0
# Enriched social data first, then the product's own socialProofs
embedded_social = self._extract_social_proofs(product)
page_views = social.get("page_views", 0) or embedded_social["page_views"] or 0
orders = social.get("orders", 0) or embedded_social["orders"] or product.get("orders", 0) or 0
baskets = social.get("baskets", 0) or embedded_social["baskets"] or 0
favorites = social.get("favorites", 0) or embedded_social["favorites"] or 0

conversion_rate = (orders / page_views * 100) if page_views > 0 else 0

@@ -139,15 +186,28 @@ class HiddenChampionFinder:
# Minimum orders check (sales data is very important)
min_orders = filters.get("min_orders", 1)  # Default: at least 1 sale

# Check whether social data exists
has_social = pid in social_details and page_views > 0

# Customized filtering (more flexible)
if has_social:
    # Products with social data: full filter
    passes_filter = (
        rating >= filters.get("min_rating", 4.6) and
        review_count < filters.get("max_review_count", 30) and
        review_count >= 1 and  # Must have at least 1 review
        orders >= min_orders and  # MUST HAVE AT LEAST 1 SALE (sales data is very important)
        (page_views >= threshold_views or page_views >= min_views_threshold) and  # Above the category average OR the minimum threshold
        (baskets >= threshold_baskets or baskets >= min_baskets_threshold) and  # Baskets also above the category average OR the minimum
        (conversion_rate >= 1.0 or page_views >= 500)  # Minimum 1% conversion OR high page views
        review_count >= 1 and
        orders >= min_orders and
        (page_views >= threshold_views or page_views >= min_views_threshold) and
        (baskets >= threshold_baskets or baskets >= min_baskets_threshold) and
        (conversion_rate >= 1.0 or page_views >= 500)
    )
else:
    # Products without social data: rating + review + orders filter only
    passes_filter = (
        rating >= filters.get("min_rating", 4.6) and
        review_count < filters.get("max_review_count", 30) and
        review_count >= 1 and
        orders >= min_orders
    )

if passes_filter:
@@ -196,7 +256,7 @@ class HiddenChampionFinder:
"category": category_name,
"rating": round(rating, 2),
"review_count": review_count,
"price": product.get("price", {}).get("sellingPrice", 0),
"price": (product.get("price", {}).get("sellingPrice", 0) or product.get("price", {}).get("discountedPrice", 0) or product.get("price", {}).get("current", 0)) if isinstance(product.get("price"), dict) else (product.get("price", 0) or 0),
"page_views": page_views,
"orders": orders,
"baskets": baskets,

@@ -245,7 +245,13 @@ def get_rating_value(product: Dict) -> float:
    rating = product.get("rating", 0)
    if isinstance(rating, dict):
        return rating.get("averageRating", 0) or 0
    return float(rating) if rating else 0
    if rating:
        return float(rating)
    # Fallback: ratingScore nested object
    rating_score = product.get("ratingScore", {})
    if isinstance(rating_score, dict):
        return float(rating_score.get("averageRating", 0) or 0)
    return 0


def get_review_count(product: Dict) -> int:
@@ -263,6 +269,11 @@ def get_review_count(product: Dict) -> int:
    rating = product.get("rating", {})
    if isinstance(rating, dict):
        review_count = rating.get("totalComments", 0) or rating.get("totalCount", 0) or 0
    if not review_count:
        # Fallback: ratingScore nested object
        rating_score = product.get("ratingScore", {})
        if isinstance(rating_score, dict):
            review_count = rating_score.get("totalCount", 0) or 0
    return int(review_count) if review_count else 0

143
backend/category_seeder.py
Normal file
@@ -0,0 +1,143 @@
"""
Category Seeder - imports Trendyol categories from JSON into the DB
Source: /Users/furkanyigit/Desktop/trendyol_categories.json
3-level hierarchy: Segment (Kadın) → Group (Giyim) → Leaf (Elbise)
"""
import json
import re
import os
from database import SessionLocal, Category, Snapshot, Report, EnrichmentError
from logging_config import get_logger

log = get_logger("seeder")

DEFAULT_JSON_PATH = os.path.expanduser("~/Desktop/trendyol_categories.json")


def parse_url(url: str) -> dict:
    """Extract path_model and trendyol_category_id from a URL.

    Examples:
        /elbise-x-c56 → path_model="elbise-x-c56", category_id=56
        /kanvas-canta-y-s20972 → path_model="kanvas-canta-y-s20972", category_id=None
        /kadin-giyim-x-g1-c82 → path_model="kadin-giyim-x-g1-c82", category_id=82
    """
    # Strip leading slash
    path_model = url.lstrip("/")

    # Try to extract -c{id} from the end
    m = re.search(r"-c(\d+)$", path_model)
    category_id = int(m.group(1)) if m else None

    return {
        "path_model": path_model,
        "trendyol_category_id": category_id,
    }


def seed_from_json(json_path: str = None, clear_existing: bool = True) -> dict:
    """Read the JSON file and write it to the DB.

    Returns:
        {"segments": int, "groups": int, "leaves": int, "total": int}
    """
    json_path = json_path or DEFAULT_JSON_PATH

    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)

    db = SessionLocal()
    try:
        if clear_existing:
            # Clear the referencing tables first because of FK constraints
            db.query(EnrichmentError).delete(synchronize_session=False)
            db.query(Report).delete(synchronize_session=False)
            db.query(Snapshot).delete(synchronize_session=False)
            db.query(Category).filter(Category.parent_id != None).delete(synchronize_session=False)  # noqa: E711
            db.query(Category).delete(synchronize_session=False)
            db.commit()
            log.info("Mevcut kategoriler ve bağlı veriler silindi")

        stats = {"segments": 0, "groups": 0, "leaves": 0, "total": 0}

        for segment_name, groups in data.items():
            # Level 1: Segment (Kadın, Erkek, ...)
            segment = Category(
                name=segment_name,
                parent_id=None,
                trendyol_category_id=None,
                trendyol_url=None,
                path_model=None,
                is_active=True,
            )
            db.add(segment)
            db.flush()  # Get the ID
            stats["segments"] += 1
            stats["total"] += 1

            for group_item in groups:
                group_name = group_item["name"]
                group_url = group_item.get("url", "")
                group_parsed = parse_url(group_url) if group_url else {"path_model": None, "trendyol_category_id": None}

                children = group_item.get("children", [])

                if children:
                    # Level 2: Group (Giyim, Ayakkabı, ...)
                    group = Category(
                        name=group_name,
                        parent_id=segment.id,
                        trendyol_category_id=group_parsed["trendyol_category_id"],
                        trendyol_url=f"https://www.trendyol.com{group_url}" if group_url else None,
                        path_model=group_parsed["path_model"],
                        is_active=True,
                    )
                    db.add(group)
                    db.flush()
                    stats["groups"] += 1
                    stats["total"] += 1

                    for leaf_item in children:
                        leaf_url = leaf_item.get("url", "")
                        leaf_parsed = parse_url(leaf_url) if leaf_url else {"path_model": None, "trendyol_category_id": None}

                        leaf = Category(
                            name=leaf_item["name"],
                            parent_id=group.id,
                            trendyol_category_id=leaf_parsed["trendyol_category_id"],
                            trendyol_url=f"https://www.trendyol.com{leaf_url}" if leaf_url else None,
                            path_model=leaf_parsed["path_model"],
                            is_active=True,
                        )
                        db.add(leaf)
                        stats["leaves"] += 1
                        stats["total"] += 1
                else:
                    # No children: this group is actually a leaf
                    leaf = Category(
                        name=group_name,
                        parent_id=segment.id,
                        trendyol_category_id=group_parsed["trendyol_category_id"],
                        trendyol_url=f"https://www.trendyol.com{group_url}" if group_url else None,
                        path_model=group_parsed["path_model"],
                        is_active=True,
                    )
                    db.add(leaf)
                    stats["leaves"] += 1
                    stats["total"] += 1

        db.commit()
        log.info(f"Seed tamamlandı: {stats}")
        return stats

    except Exception as e:
        db.rollback()
        log.error(f"Seed hatası: {e}")
        raise
    finally:
        db.close()


if __name__ == "__main__":
    result = seed_from_json()
    print(f"Seed tamamlandı: {result}")
791
backend/data_consolidator.py
Normal file
@@ -0,0 +1,791 @@
|
||||
"""
|
||||
Data Consolidator: module that builds the single consolidated JSON.

Once scraping and enrichment finish, it performs all normalization and
calculation and saves the result as reports/report_{id}_data.json.
The dashboard endpoint only reads this file.
|
||||
"""
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import time
|
||||
import random
|
||||
from collections import defaultdict
|
||||
from datetime import datetime
|
||||
|
||||
import numpy as np
|
||||
|
||||
from logging_config import get_logger
|
||||
|
||||
log = get_logger("consolidator")
|
||||
|
||||
# ─────────────────────────────────────────────────────────
|
||||
# Country code → full name mapping (for origin analysis)
|
||||
# ─────────────────────────────────────────────────────────
|
||||
COUNTRY_NAMES = {
|
||||
"TR": "Türkiye", "CN": "Çin", "US": "Amerika", "GB": "İngiltere",
|
||||
"FR": "Fransa", "DE": "Almanya", "IT": "İtalya", "ES": "İspanya",
|
||||
"KR": "Güney Kore", "JP": "Japonya", "IN": "Hindistan", "TW": "Tayvan",
|
||||
"HK": "Hong Kong", "TH": "Tayland", "VN": "Vietnam", "PL": "Polonya",
|
||||
"CZ": "Çek Cumhuriyeti", "RO": "Romanya", "BG": "Bulgaristan",
|
||||
"GR": "Yunanistan", "PT": "Portekiz", "NL": "Hollanda", "BE": "Belçika",
|
||||
"CH": "İsviçre", "AT": "Avusturya", "SE": "İsveç", "NO": "Norveç",
|
||||
"DK": "Danimarka", "FI": "Finlandiya", "RU": "Rusya", "UA": "Ukrayna",
|
||||
"AE": "Birleşik Arap Emirlikleri", "SA": "Suudi Arabistan", "IL": "İsrail",
|
||||
"EG": "Mısır", "ZA": "Güney Afrika", "BR": "Brezilya", "MX": "Meksika",
|
||||
"CA": "Kanada", "AU": "Avustralya", "NZ": "Yeni Zelanda", "SG": "Singapur",
|
||||
"MY": "Malezya", "ID": "Endonezya", "PH": "Filipinler", "PK": "Pakistan",
|
||||
"BD": "Bangladeş", "AZ": "Azerbaycan",
|
||||
}
|
||||
|
||||
# Barcode prefix → country (EAN-13)
|
||||
BARCODE_COUNTRIES = {
|
||||
"TYB": "Trendyol (İç Barkod)", "SGT": "Trendyol Satıcı",
|
||||
"KPE": "Trendyol Kampanya", "RTN": "Trendyol İade", "CDM": "Trendyol Özel",
|
||||
"00-13": "ABD & Kanada", "190-199": "Rezerve/Özel Kullanım",
|
||||
"20-29": "Mağaza İçi Kullanım", "30-37": "Fransa",
|
||||
"380": "Bulgaristan", "383": "Slovenya", "370": "Litvanya",
|
||||
"372": "Estonya", "373": "Moldova", "375": "Belarus",
|
||||
"377": "Ermenistan", "379": "Kazakistan", "385": "Hırvatistan",
|
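    # GS1 assigns 450-459 and 490-499 to Japan; 460-469 belongs to Russia.
    "387": "Bosna Hersek", "400-440": "Almanya",
    "450-459": "Japonya", "460-469": "Rusya", "490-499": "Japonya",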
|
||||
"50": "İngiltere", "520-521": "Yunanistan", "528": "Lübnan",
|
||||
"529": "Kıbrıs", "530": "Arnavutluk", "531": "Makedonya",
|
||||
"535": "Malta", "539": "İrlanda", "54": "Belçika & Lüksemburg",
|
||||
"560": "Portekiz", "569": "İzlanda", "57": "Danimarka",
|
||||
"590": "Polonya", "594": "Romanya", "599": "Macaristan",
|
||||
"600-601": "Güney Afrika", "603": "Gana", "608": "Bahreyn",
|
||||
"609": "Mauritius", "611": "Fas", "613": "Cezayir",
|
||||
"615": "Nijerya", "616": "Kenya", "618": "Fildişi Sahili",
|
||||
"619": "Tunus", "621": "Suriye", "622": "Mısır",
|
||||
"624": "Libya", "625": "Ürdün", "626": "İran",
|
||||
"627": "Kuveyt", "628": "Suudi Arabistan", "629": "BAE",
|
||||
"630": "Katar", "631": "Umman", "64": "Finlandiya",
|
||||
"690-699": "Çin", "70": "Norveç", "710-719": "Rezerve/Özel Kullanım",
|
||||
"729": "İsrail", "73": "İsveç", "740": "Guatemala",
|
||||
"741": "El Salvador", "742": "Honduras", "743": "Nikaragua",
|
||||
"744": "Kosta Rika", "745": "Panama", "746": "Dominik Cumhuriyeti",
|
||||
"750": "Meksika", "754-755": "Kanada", "759": "Venezuela",
|
||||
"76": "İsviçre", "770-771": "Kolombiya", "773": "Uruguay",
|
||||
"775": "Peru", "777": "Bolivya", "779": "Arjantin",
|
||||
"780": "Şili", "784": "Paraguay", "786": "Ekvador",
|
||||
"789-790": "Brezilya", "80-83": "İtalya", "84": "İspanya",
|
||||
"850": "Küba", "858": "Slovakya", "859": "Çek Cumhuriyeti",
|
||||
"860": "Sırbistan", "865": "Moğolistan", "867": "Kuzey Kore",
|
||||
"868-869": "Türkiye", "87": "Hollanda", "880": "Güney Kore",
|
||||
"884": "Kamboçya", "885": "Tayland", "888": "Singapur",
|
||||
"890": "Hindistan", "893": "Vietnam", "896": "Pakistan",
|
||||
"899": "Endonezya", "90-91": "Avusturya", "93": "Avustralya",
|
||||
"94": "Yeni Zelanda", "955": "Malezya", "958": "Makao",
|
||||
"977": "Süreli Yayınlar (ISSN)", "978-979": "Kitaplar (ISBN)",
|
||||
"980": "Para İade Kuponları", "981-984": "Kuponlar", "99": "Kuponlar",
|
||||
}
|
||||
|
||||
|
||||
# ─────────────────────────────────────────────────────────
|
||||
# Helper functions
|
||||
# ─────────────────────────────────────────────────────────
|
||||
|
||||
def _extract_price(p):
    """Extract selling price from product, handling both old and Search API formats."""
    pr = p.get("price") or {}  # tolerate a missing or null price field
    if isinstance(pr, (int, float)):
        return pr
    return (pr.get("sellingPrice") or pr.get("discountedPrice")
            or pr.get("current") or pr.get("originalPrice")
            or pr.get("old") or 0)
|
||||
|
||||
|
||||
def _extract_rating(p):
|
||||
"""Extract average rating from product."""
|
||||
rating = p.get("ratingScore") or p.get("rating", 0)
|
||||
if isinstance(rating, dict):
|
||||
rating = rating.get("averageRating", 0)
|
||||
try:
|
||||
return float(rating) if rating else 0.0
|
||||
except (ValueError, TypeError):
|
||||
return 0.0
|
||||
|
||||
|
||||
def _extract_review_count(p):
|
||||
"""Extract review/comment count from product."""
|
||||
review_count = 0
|
||||
try:
|
||||
review_count = int(p.get("rating_count", 0) or 0)
|
||||
except (ValueError, TypeError, AttributeError):
|
||||
pass
|
||||
if not review_count:
|
||||
try:
|
||||
rating_obj = p.get("ratingScore") or p.get("rating", {})
|
||||
if isinstance(rating_obj, dict):
|
||||
review_count = int(
|
||||
rating_obj.get("totalCount", 0)
|
||||
or rating_obj.get("totalComments", 0)
|
||||
or 0
|
||||
)
|
||||
except (ValueError, TypeError, AttributeError):
|
||||
review_count = 0
|
||||
return review_count
|
||||
|
||||
|
||||
def _parse_social_value(value_str):
    """Parse a social proof value such as '642', '1.2k', '10B+' or '2M+'.

    Assumption: on Trendyol, 'B' abbreviates Turkish "bin" (thousand) and
    'M' abbreviates "milyon" (million), so '10B+' is read as 10,000.
    """
    multipliers = {"k": 1_000, "b": 1_000, "m": 1_000_000}
    try:
        s = str(value_str).strip().lower().replace("+", "")
        if s and s[-1] in multipliers:
            return round(float(s[:-1]) * multipliers[s[-1]])
        return int(s)
    except (ValueError, TypeError):
        return 0
|
||||
|
||||
|
||||
def _detect_barcode_country(prefix_num):
|
||||
"""Detect country from barcode prefix using BARCODE_COUNTRIES mapping."""
|
||||
for key, country in BARCODE_COUNTRIES.items():
|
||||
if "-" in key:
|
||||
start, end = key.split("-")
|
||||
try:
|
||||
range_len = len(start)
|
||||
prefix_to_check = prefix_num[:range_len] if len(prefix_num) >= range_len else prefix_num
|
||||
prefix_int = int(prefix_to_check) if prefix_to_check.isdigit() else -1
|
||||
if int(start) <= prefix_int <= int(end):
|
||||
return country
|
||||
except ValueError:
|
||||
continue
|
||||
elif key == prefix_num[:len(key)]:
|
||||
return country
|
||||
return "Bilinmiyor"
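The prefix lookup above mixes exact keys (`"50"`, `"TYB"`) with range keys (`"690-699"`), and the first matching key in dictionary order wins. A minimal sketch of that matching idea follows; `SAMPLE_PREFIXES` and `detect_country` are hypothetical names used only for illustration, and the mapping is a small subset of `BARCODE_COUNTRIES`, not the real table.

```python
# Illustrative sketch only: SAMPLE_PREFIXES is a hypothetical, trimmed-down
# stand-in for BARCODE_COUNTRIES; detect_country mirrors the matching idea.
SAMPLE_PREFIXES = {
    "TYB": "Trendyol (İç Barkod)",
    "30-37": "Fransa",
    "50": "İngiltere",
    "690-699": "Çin",
    "868-869": "Türkiye",
}


def detect_country(prefix, mapping=SAMPLE_PREFIXES):
    """Return the mapped country for a 3-character barcode prefix."""
    for key, country in mapping.items():
        if "-" in key:
            start, end = key.split("-")
            head = prefix[:len(start)]  # compare at the range's own width
            if head.isdecimal() and int(start) <= int(head) <= int(end):
                return country
        elif prefix.startswith(key):
            return country
    return "Bilinmiyor"  # same "unknown" label the module uses
```

Because iteration follows insertion order, a broad range such as `"30-37"` shadows any narrower key listed after it (for example `"370"`), so narrower keys would need to come first to be reachable.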
|
||||
|
||||
|
||||
# ─────────────────────────────────────────────────────────
|
||||
# 1. normalize_product
|
||||
# ─────────────────────────────────────────────────────────
|
||||
|
||||
def normalize_product(raw_product, category_name, social_details):
    """
    Convert a raw product into a flat structure.
    Priority: inline socialProofs (Top Rankings) > enrichment API (social_details)
    """
|
||||
product_id = raw_product.get("contentId") or raw_product.get("id")
|
||||
price = _extract_price(raw_product)
|
||||
rating = _extract_rating(raw_product)
|
||||
review_count = _extract_review_count(raw_product)
|
||||
|
||||
brand = raw_product.get("brand", {})
|
||||
brand_name = (brand.get("name") if isinstance(brand, dict) else brand) or "Bilinmeyen"
|
||||
|
||||
    # ── Social proof: inline socialProofs first, then enrichment ──
    orders, page_views, baskets, favorites = 0, 0, 0, 0

    # Inline socialProofs (Top Rankings API, stored in the product file)
    social_proofs = raw_product.get("socialProofs", [])
    if isinstance(social_proofs, list):
        for proof in social_proofs:
            if not isinstance(proof, dict):
                continue  # skip malformed entries
            proof_type = proof.get("type", "")
            parsed = _parse_social_value(proof.get("value", "0"))
            if proof_type == "orderCountL3D":
                orders = parsed
            elif proof_type == "pageViewCount":
                page_views = parsed
            elif proof_type == "basketCount":
                baskets = parsed
            elif proof_type == "favoriteCount":
                favorites = parsed

    # Enrichment API (social.json): fallback when the inline value is missing or 0
    # The key may be str or int (str when loaded from file, int when in memory)
    sp = {}
    if product_id and social_details:
        sp = (social_details.get(str(product_id))
              or social_details.get(int(product_id) if str(product_id).isdigit() else -1)
              or {})
    if not orders:
        orders = sp.get("orders", 0) or 0
    if not page_views:
        page_views = sp.get("page_views", 0) or 0
    if not baskets:
        baskets = sp.get("baskets", 0) or 0
    if not favorites:
        favorites = sp.get("favorites", 0) or 0
|
||||
|
||||
# ── Image URL ──
|
||||
image_url = raw_product.get("imageUrl", "")
|
||||
if not image_url:
|
||||
images = raw_product.get("images", [])
|
||||
image_url = images[0] if isinstance(images, list) and images else ""
|
||||
|
||||
# ── Product URL ──
|
||||
product_url = raw_product.get("url", "")
|
||||
if not product_url and product_id:
|
||||
product_url = f"https://www.trendyol.com/p/{product_id}"
|
||||
|
||||
# ── Barcode ──
|
||||
barcode = ""
|
||||
winner_variant = raw_product.get("winnerVariant", {})
|
||||
if isinstance(winner_variant, dict):
|
||||
barcode = winner_variant.get("barcode", "")
|
||||
|
||||
# ── Country (origin) ──
|
||||
country_code = ""
|
||||
country_name = "Bilinmeyen"
|
||||
merchant_listings = raw_product.get("merchantListings", [])
|
||||
if merchant_listings:
|
||||
custom_values = merchant_listings[0].get("customValues", [])
|
||||
for cv in custom_values:
|
||||
if cv.get("key") == "origin":
|
||||
country_code = cv.get("value", "").upper()
|
||||
country_name = COUNTRY_NAMES.get(
|
||||
country_code, f"Diğer ({country_code})" if country_code else "Bilinmeyen"
|
||||
)
|
||||
break
|
||||
|
||||
return {
|
||||
"id": product_id,
|
||||
"name": raw_product.get("name", ""),
|
||||
"brand": brand_name,
|
||||
"category": category_name,
|
||||
"category_name": category_name, # Frontend uyumluluğu (ProductFinderTab, OpportunityTab)
|
||||
"price": round(price, 2) if price else 0,
|
||||
"rating": round(rating, 2),
|
||||
"review_count": review_count,
|
||||
"orders": orders,
|
||||
"page_views": page_views,
|
||||
"baskets": baskets,
|
||||
"favorites": favorites,
|
||||
"barcode": barcode,
|
||||
"country_code": country_code,
|
||||
"country": country_name,
|
||||
"image_url": image_url or "https://via.placeholder.com/150",
|
||||
"url": product_url,
|
||||
"in_stock": raw_product.get("inStock", False),
|
||||
}
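The precedence rule in `normalize_product` (a non-zero inline value wins, otherwise the enrichment value is used, otherwise 0) is applied per metric rather than per product. A minimal sketch of that rule, assuming both sources have already been reduced to plain dicts; `merge_social_proof` is a hypothetical helper, not part of the module.

```python
def merge_social_proof(inline, enrichment):
    """Prefer non-zero inline values; fall back to enrichment per metric."""
    metrics = ("orders", "page_views", "baskets", "favorites")
    merged = {}
    for metric in metrics:
        # `or` treats both a missing key and an explicit 0/None as "absent"
        merged[metric] = inline.get(metric) or enrichment.get(metric) or 0
    return merged


inline = {"orders": 0, "baskets": 120}
enrichment = {"orders": 15, "baskets": 90, "favorites": 300}
merged = merge_social_proof(inline, enrichment)
```

Here `orders` falls back to the enrichment value because the inline value is 0, while `baskets` keeps the inline value even though enrichment also reports one.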
|
||||
|
||||
|
||||
# ─────────────────────────────────────────────────────────
|
||||
# 2. calculate_kpis
|
||||
# ─────────────────────────────────────────────────────────
|
||||
|
||||
def calculate_kpis(products):
    """KPI calculation (logic from main.py lines 2182-2262)."""
|
||||
total_products = len(products)
|
||||
prices = [p["price"] for p in products if p["price"] > 0]
|
||||
ratings = [p["rating"] for p in products if p["rating"] > 0]
|
||||
|
||||
avg_price = sum(prices) / len(prices) if prices else 0
|
||||
median_price = float(np.percentile(prices, 50)) if prices else 0
|
||||
min_price = min(prices) if prices else 0
|
||||
max_price = max(prices) if prices else 0
|
||||
|
||||
avg_rating = sum(ratings) / len(ratings) if ratings else 0
|
||||
low_rating_count = sum(1 for r in ratings if r < 3.0)
|
||||
low_rating_rate = (low_rating_count / len(ratings) * 100) if ratings else 0
|
||||
|
||||
unique_brands = set(p["brand"] for p in products if p["brand"] and p["brand"] != "Bilinmeyen")
|
||||
unique_subcategories = set(p["category"] for p in products if p["category"])
|
||||
|
||||
return {
|
||||
"total_products": total_products,
|
||||
"total_subcategories": len(unique_subcategories),
|
||||
"total_brands": len(unique_brands),
|
||||
"avg_price": round(avg_price, 2),
|
||||
"median_price": round(median_price, 2),
|
||||
"avg_rating": round(avg_rating, 2),
|
||||
"low_rating_count": low_rating_count,
|
||||
"low_rating_rate": round(low_rating_rate, 2),
|
||||
"min_price": round(min_price, 2),
|
||||
"max_price": round(max_price, 2),
|
||||
}
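A small worked example may help when checking the KPI numbers by hand. The sketch below is a reduced version of the same calculations, assuming products are already normalized dicts; it swaps `np.percentile(prices, 50)` for the standard library's `statistics.median`, which gives the same value, and `summarize` is a hypothetical name.

```python
from statistics import mean, median


def summarize(products):
    """Average price, median price and low-rating rate, ignoring zeros."""
    prices = [p["price"] for p in products if p["price"] > 0]
    ratings = [p["rating"] for p in products if p["rating"] > 0]
    low = sum(1 for r in ratings if r < 3.0)
    return {
        "avg_price": round(mean(prices), 2) if prices else 0,
        "median_price": round(median(prices), 2) if prices else 0,
        "low_rating_rate": round(low / len(ratings) * 100, 2) if ratings else 0,
    }


sample = [
    {"price": 100, "rating": 4.5},
    {"price": 200, "rating": 2.5},
    {"price": 0, "rating": 0},    # no price, no rating: excluded from both
    {"price": 600, "rating": 0},  # unrated: counts for price only
]
kpis = summarize(sample)
```

Note that unrated products (rating 0) are excluded from the low-rating rate, so the rate is relative to rated products only.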
|
||||
|
||||
|
||||
# ─────────────────────────────────────────────────────────
|
||||
# 3. calculate_charts
|
||||
# ─────────────────────────────────────────────────────────
|
||||
|
||||
def calculate_charts(products):
    """Chart data calculation (logic from main.py lines 2264-3248)."""
|
||||
prices = [p["price"] for p in products if p["price"] > 0]
|
||||
total_products = len(products)
|
||||
|
||||
# ── Price distribution ──
|
||||
price_ranges = {"0-100": 0, "100-250": 0, "250-500": 0, "500-1000": 0, "1000+": 0}
|
||||
for price in prices:
|
||||
if price < 100:
|
||||
price_ranges["0-100"] += 1
|
||||
elif price < 250:
|
||||
price_ranges["100-250"] += 1
|
||||
elif price < 500:
|
||||
price_ranges["250-500"] += 1
|
||||
elif price < 1000:
|
||||
price_ranges["500-1000"] += 1
|
||||
else:
|
||||
price_ranges["1000+"] += 1
|
||||
|
||||
# ── Kategori ve marka grupları ──
|
||||
categories_data = defaultdict(list)
|
||||
brands_data = defaultdict(int)
|
||||
for p in products:
|
||||
categories_data[p["category"]].append(p)
|
||||
brands_data[p["brand"]] += 1
|
||||
|
||||
# ── Top categories (satışa göre sıralı) ──
|
||||
top_categories = []
|
||||
for cat_name, cat_products in categories_data.items():
|
||||
total_orders = sum(p["orders"] for p in cat_products)
|
||||
top_categories.append({
|
||||
"name": cat_name,
|
||||
"count": len(cat_products),
|
||||
"total_orders": total_orders,
|
||||
})
|
||||
top_categories = sorted(top_categories, key=lambda x: x["total_orders"], reverse=True)[:20]
|
||||
|
||||
# ── Top brands ──
|
||||
top_brands = sorted(
|
||||
[{"name": brand, "count": count} for brand, count in brands_data.items()],
|
||||
key=lambda x: x["count"], reverse=True,
|
||||
)[:20]
|
||||
|
||||
# ── Rating distribution ──
|
||||
rating_distribution = {"0-1": 0, "1-2": 0, "2-3": 0, "3-4": 0, "4-5": 0}
|
||||
for p in products:
|
||||
r = p["rating"]
|
||||
if r < 1:
|
||||
rating_distribution["0-1"] += 1
|
||||
elif r < 2:
|
||||
rating_distribution["1-2"] += 1
|
||||
elif r < 3:
|
||||
rating_distribution["2-3"] += 1
|
||||
elif r < 4:
|
||||
rating_distribution["3-4"] += 1
|
||||
else:
|
||||
rating_distribution["4-5"] += 1
|
||||
|
||||
# ── Brand price boxplot (top 10) ──
|
||||
brand_price_stats = []
|
||||
for brand_name in [b["name"] for b in top_brands[:10]]:
|
||||
bp = [p["price"] for p in products if p["brand"] == brand_name and p["price"] > 0]
|
||||
if bp and len(bp) >= 4:
|
||||
pcts = np.percentile(bp, [0, 25, 50, 75, 100])
|
||||
brand_price_stats.append({
|
||||
"brand": brand_name,
|
||||
"min": round(float(pcts[0]), 2),
|
||||
"q1": round(float(pcts[1]), 2),
|
||||
"median": round(float(pcts[2]), 2),
|
||||
"q3": round(float(pcts[3]), 2),
|
||||
"max": round(float(pcts[4]), 2),
|
||||
"count": len(bp),
|
||||
})
|
||||
|
||||
# ── Scatter plot (price vs rating) — sample 500 ──
|
||||
scatter_data = []
|
||||
sample_size = min(500, len(products))
|
||||
sampled = random.sample(products, sample_size) if products else []
|
||||
for p in sampled:
|
||||
if p["price"] > 0 and p["rating"] > 0:
|
||||
scatter_data.append({
|
||||
"price": p["price"],
|
||||
"rating": p["rating"],
|
||||
"brand": p["brand"],
|
||||
"in_stock": p["in_stock"],
|
||||
})
|
||||
|
||||
# ── Brand strength score ──
|
||||
brand_strength_scores = []
|
||||
for brand_name in [b["name"] for b in top_brands[:10]]:
|
||||
bp = [p for p in products if p["brand"] == brand_name]
|
||||
brand_count = len(bp)
|
||||
brand_share = (brand_count / total_products * 100) if total_products > 0 else 0
|
||||
brand_ratings = [p["rating"] for p in bp if p["rating"] > 0]
|
||||
brand_avg_rating = sum(brand_ratings) / len(brand_ratings) if brand_ratings else 0
|
||||
brand_out_of_stock = sum(1 for p in bp if not p["in_stock"])
|
||||
stockout_rate = (brand_out_of_stock / brand_count * 100) if brand_count > 0 else 0
|
||||
strength = brand_share + (brand_avg_rating * 5) - stockout_rate
|
||||
brand_strength_scores.append({
|
||||
"brand": brand_name,
|
||||
"share": round(brand_share, 2),
|
||||
"avg_rating": round(brand_avg_rating, 2),
|
||||
"stockout_rate": round(stockout_rate, 2),
|
||||
"strength_score": round(strength, 2),
|
||||
})
|
||||
brand_strength_scores.sort(key=lambda x: x["strength_score"], reverse=True)
|
||||
|
||||
# ── Heatmap: Brand × Category ──
|
||||
top_10_brands = [b["name"] for b in top_brands[:10]]
|
||||
top_10_cats = [c["name"] for c in top_categories[:10]]
|
||||
heatmap_data = []
|
||||
for cat_name in top_10_cats:
|
||||
cat_products = categories_data.get(cat_name, [])
|
||||
for brand_name in top_10_brands:
|
||||
count = sum(1 for p in cat_products if p["brand"] == brand_name)
|
||||
if count > 0:
|
||||
heatmap_data.append({"brand": brand_name, "category": cat_name, "value": count})
|
||||
|
||||
# ── Category price premium ──
|
||||
avg_price = sum(prices) / len(prices) if prices else 0
|
||||
category_price_analysis = []
|
||||
for cat_name, cat_products in categories_data.items():
|
||||
cp = [p["price"] for p in cat_products if p["price"] > 0]
|
||||
if cp:
|
||||
cat_avg = sum(cp) / len(cp)
|
||||
cat_median = float(np.percentile(cp, 50))
|
||||
premium = ((cat_avg - avg_price) / avg_price * 100) if avg_price > 0 else 0
|
||||
category_price_analysis.append({
|
||||
"category": cat_name,
|
||||
"avg_price": round(cat_avg, 2),
|
||||
"median_price": round(cat_median, 2),
|
||||
"price_premium": round(premium, 2),
|
||||
"product_count": len(cp),
|
||||
"min_price": round(min(cp), 2),
|
||||
"max_price": round(max(cp), 2),
|
||||
})
|
||||
category_price_analysis.sort(key=lambda x: x["price_premium"], reverse=True)
|
||||
most_expensive = [c for c in category_price_analysis if c["price_premium"] > 0][:10]
|
||||
most_affordable = [c for c in category_price_analysis if c["price_premium"] < 0][-10:]
|
||||
most_affordable.reverse()
|
||||
|
||||
# ── Origin analysis ──
|
||||
origin_counts = defaultdict(int)
|
||||
products_with_origin = 0
|
||||
for p in products:
|
||||
if p["country_code"]:
|
||||
origin_counts[p["country_code"]] += 1
|
||||
products_with_origin += 1
|
||||
|
||||
origin_country_data = sorted(
|
||||
[
|
||||
{
|
||||
"country_code": code,
|
||||
"country_name": COUNTRY_NAMES.get(code, f"Diğer ({code})"),
|
||||
"product_count": count,
|
||||
"percentage": round(count / products_with_origin * 100, 2) if products_with_origin else 0,
|
||||
}
|
||||
for code, count in origin_counts.items()
|
||||
],
|
||||
key=lambda x: x["product_count"], reverse=True,
|
||||
)
|
||||
|
||||
# ── Barcode analysis ──
|
||||
barcode_prefixes = defaultdict(int)
|
||||
barcode_countries_detected = defaultdict(int)
|
||||
products_with_barcode = 0
|
||||
for p in products:
|
||||
bc = p.get("barcode", "")
|
||||
if bc and len(bc) >= 3:
|
||||
products_with_barcode += 1
|
||||
prefix = bc[:3]
|
||||
barcode_prefixes[prefix] += 1
|
||||
detected = _detect_barcode_country(prefix)
|
||||
barcode_countries_detected[detected] += 1
|
||||
|
||||
barcode_prefix_data = sorted(
|
||||
[
|
||||
{
|
||||
"prefix": prefix,
|
||||
"detected_country": _detect_barcode_country(prefix),
|
||||
"product_count": count,
|
||||
"percentage": round(count / products_with_barcode * 100, 2) if products_with_barcode else 0,
|
||||
}
|
||||
for prefix, count in barcode_prefixes.items()
|
||||
],
|
||||
key=lambda x: x["product_count"], reverse=True,
|
||||
)[:20]
|
||||
|
||||
barcode_country_data = sorted(
|
||||
[
|
||||
{
|
||||
"country_name": country,
|
||||
"product_count": count,
|
||||
"percentage": round(count / products_with_barcode * 100, 2) if products_with_barcode else 0,
|
||||
}
|
||||
for country, count in barcode_countries_detected.items()
|
||||
],
|
||||
key=lambda x: x["product_count"], reverse=True,
|
||||
)
|
||||
|
||||
    # ── Merchant analysis ──
    # Merchant fields (merchantListings) are dropped during normalization, so
    # merchant analysis is computed separately from the raw products in
    # _calculate_merchant_analysis() and attached by build_consolidated_report().
|
||||
|
||||
# ── Build result ──
|
||||
return {
|
||||
"price_distribution": price_ranges,
|
||||
"top_categories": top_categories,
|
||||
"top_brands": top_brands,
|
||||
"rating_distribution": rating_distribution,
|
||||
"brand_price_boxplot": brand_price_stats,
|
||||
"price_rating_scatter": scatter_data,
|
||||
"brand_strength": brand_strength_scores,
|
||||
"brand_category_heatmap": heatmap_data,
|
||||
"category_price_premium": {
|
||||
"all_categories": category_price_analysis,
|
||||
"most_expensive": most_expensive,
|
||||
"most_affordable": most_affordable,
|
||||
},
|
||||
"origin_analysis": {
|
||||
"countries": origin_country_data,
|
||||
"top_countries": origin_country_data[:10],
|
||||
"total_products_with_origin": products_with_origin,
|
||||
"coverage_percentage": round(products_with_origin / total_products * 100, 2) if total_products else 0,
|
||||
},
|
||||
"barcode_analysis": {
|
||||
"prefixes": barcode_prefix_data,
|
||||
"countries_from_barcode": barcode_country_data,
|
||||
"top_countries_from_barcode": barcode_country_data[:10],
|
||||
"total_products_with_barcode": products_with_barcode,
|
||||
"coverage_percentage": round(products_with_barcode / total_products * 100, 2) if total_products else 0,
|
||||
},
|
||||
}
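The brand strength score computed in `calculate_charts` combines three terms: market share in percent, plus five times the average rating, minus the stock-out rate in percent. A minimal sketch of that arithmetic, with `brand_strength` as a hypothetical standalone helper:

```python
def brand_strength(brand_count, total_products, avg_rating, out_of_stock):
    """share% + avg_rating * 5 - stockout%, rounded to two decimals."""
    share = brand_count / total_products * 100 if total_products else 0
    stockout_rate = out_of_stock / brand_count * 100 if brand_count else 0
    return round(share + avg_rating * 5 - stockout_rate, 2)


# 20 of 100 products, average rating 4.5, 2 products out of stock:
# 20 + 22.5 - 10
score = brand_strength(20, 100, 4.5, 2)
```

Since the rating term is capped at 25 while share and stock-out rate both range up to 100, a brand with many out-of-stock listings can end up with a negative score.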
|
||||
|
||||
|
||||
def _calculate_merchant_analysis(raw_products, categories_data):
    """
    Compute merchant analysis from raw product data (requires the merchantListings field).

    raw_products: list of raw Trendyol product dicts
    categories_data: {cat_name: [products]} (currently unused by this function)
    """
|
||||
merchants_data = {}
|
||||
total_winners = 0
|
||||
products_with_merchant = 0
|
||||
|
||||
for product in raw_products:
|
||||
merchant_listings = product.get("merchantListings", [])
|
||||
if not merchant_listings:
|
||||
continue
|
||||
ml = merchant_listings[0]
|
||||
merchant = ml.get("merchant", {})
|
||||
merchant_id = merchant.get("id")
|
||||
if not merchant_id:
|
||||
continue
|
||||
|
||||
products_with_merchant += 1
|
||||
if merchant_id not in merchants_data:
|
||||
merchant_name = merchant.get("name") or merchant.get("officialName") or f"Satıcı {merchant_id}"
|
||||
merchants_data[merchant_id] = {
|
||||
"merchant_id": merchant_id,
|
||||
"merchant_name": merchant_name,
|
||||
"product_count": 0,
|
||||
"total_price": 0,
|
||||
"winner_count": 0,
|
||||
}
|
||||
|
||||
merchants_data[merchant_id]["product_count"] += 1
|
||||
price = _extract_price(product)
|
||||
if price > 0:
|
||||
merchants_data[merchant_id]["total_price"] += price
|
||||
if ml.get("isWinner"):
|
||||
merchants_data[merchant_id]["winner_count"] += 1
|
||||
total_winners += 1
|
||||
|
||||
merchant_list = []
|
||||
for mid, data in merchants_data.items():
|
||||
avg_price = data["total_price"] / data["product_count"] if data["product_count"] > 0 else 0
|
||||
winner_ratio = (data["winner_count"] / data["product_count"] * 100) if data["product_count"] > 0 else 0
|
||||
merchant_url = None
|
||||
if data["merchant_name"] and not data["merchant_name"].startswith("Satıcı "):
|
||||
merchant_url = f"https://www.trendyol.com/magaza/{data['merchant_name'].lower().replace(' ', '-')}-m-{mid}"
|
||||
merchant_list.append({
|
||||
"merchant_id": mid,
|
||||
"merchant_name": data["merchant_name"],
|
||||
"merchant_url": merchant_url,
|
||||
"product_count": data["product_count"],
|
||||
"avg_price": round(avg_price, 2),
|
||||
"winner_count": data["winner_count"],
|
||||
"winner_ratio": round(winner_ratio, 2),
|
||||
})
|
||||
|
||||
merchant_list.sort(key=lambda x: x["product_count"], reverse=True)
|
||||
total_products = len(raw_products)
|
||||
total_merchants = len(merchants_data)
|
||||
winner_percentage = (total_winners / products_with_merchant * 100) if products_with_merchant > 0 else 0
|
||||
|
||||
return {
|
||||
"merchants": merchant_list,
|
||||
"top_merchants": merchant_list[:20],
|
||||
"total_merchants": total_merchants,
|
||||
"total_products_with_merchant": products_with_merchant,
|
||||
"total_winners": total_winners,
|
||||
"winner_percentage": round(winner_percentage, 2),
|
||||
"coverage_percentage": round(products_with_merchant / total_products * 100, 2) if total_products else 0,
|
||||
}
|
||||
|
||||
|
||||
# ─────────────────────────────────────────────────────────
|
||||
# 4. calculate_insights
|
||||
# ─────────────────────────────────────────────────────────
|
||||
|
||||
def calculate_insights(products):
    """Low-rated products and price anomalies."""
|
||||
# ── Low rating products ──
|
||||
low_rating = []
|
||||
for p in products:
|
||||
if 0 < p["rating"] < 3.0:
|
||||
low_rating.append({
|
||||
"name": p["name"][:50],
|
||||
"brand": p["brand"],
|
||||
"rating": p["rating"],
|
||||
"price": p["price"],
|
||||
"in_stock": p["in_stock"],
|
||||
})
|
||||
low_rating = sorted(low_rating, key=lambda x: x["rating"])[:20]
|
||||
|
||||
# ── Anomalies (IQR) ──
|
||||
prices = [p["price"] for p in products if p["price"] > 0]
|
||||
anomalies = []
|
||||
if len(prices) > 4:
|
||||
q1, q3 = np.percentile(prices, [25, 75])
|
||||
iqr = q3 - q1
|
||||
lower = q1 - 1.5 * iqr
|
||||
upper = q3 + 1.5 * iqr
|
||||
for p in products:
|
||||
if p["price"] > 0 and (p["price"] < lower or p["price"] > upper):
|
||||
anomalies.append({
|
||||
"name": p["name"][:50],
|
||||
"brand": p["brand"],
|
||||
"price": p["price"],
|
||||
"type": "expensive" if p["price"] > upper else "cheap",
|
||||
})
|
||||
anomalies = sorted(anomalies, key=lambda x: x["price"], reverse=True)[:20]
|
||||
|
||||
return {"low_rating_products": low_rating, "anomalies": anomalies}
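The anomaly detection above uses the usual IQR fences: anything below Q1 − 1.5·IQR or above Q3 + 1.5·IQR is flagged. The sketch below reproduces the fences on a tiny price list using only the standard library; `statistics.quantiles(..., method="inclusive")` is used here as a stand-in for `np.percentile` with its default linear interpolation, and `iqr_fences` is a hypothetical name.

```python
from statistics import quantiles


def iqr_fences(prices, k=1.5):
    """Return (lower, upper) outlier fences based on the interquartile range."""
    q1, _, q3 = quantiles(prices, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


prices = [10, 12, 14, 16, 18, 200]
lower, upper = iqr_fences(prices)
outliers = [p for p in prices if p < lower or p > upper]
```

For this list Q1 is 12.5 and Q3 is 17.5, so the fences are 5.0 and 25.0 and only the 200 price is flagged as "expensive".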
|
||||
|
||||
|
||||
# ─────────────────────────────────────────────────────────
|
||||
# 5. build_consolidated_report (ana orkestratör)
|
||||
# ─────────────────────────────────────────────────────────
|
||||
|
||||
def build_consolidated_report(report_id, db, reports_dir, social_data=None):
    """
    Load report data, normalize it, run the calculations, and return the result.

    Args:
        report_id: DB report ID
        db: SQLAlchemy session
        reports_dir: path to the reports/ directory
        social_data: enrichment social.json data (optional; read from file if omitted)
    Returns:
        Consolidated dashboard dict, or None if the report, its JSON file,
        or any usable product is missing
    """
|
||||
from database import Report
|
||||
t0 = time.time()
|
||||
|
||||
report = db.query(Report).filter(Report.id == report_id).first()
|
||||
if not report:
|
||||
return None
|
||||
if not report.json_file_path or not os.path.exists(report.json_file_path):
|
||||
return None
|
||||
|
||||
# Rapor meta verisini oku
|
||||
with open(report.json_file_path, "r", encoding="utf-8") as f:
|
||||
report_data = json.load(f)
|
||||
|
||||
# Social proof verisini yükle
|
||||
social_details = {}
|
||||
if social_data:
|
||||
social_details = social_data.get("details", {})
|
||||
else:
|
||||
social_file = os.path.join(reports_dir, f"enrich_{report_id}", "social.json")
|
||||
if os.path.exists(social_file):
|
||||
try:
|
||||
with open(social_file, "r", encoding="utf-8") as f:
|
||||
soc = json.load(f)
|
||||
social_details = soc.get("details", {})
|
||||
except Exception as e:
|
||||
log.warning(f"Social proof dosyası okunamadı: {e}")
|
||||
|
||||
# ── Ham ürünleri yükle ve normalize et ──
|
||||
normalized_products = []
|
||||
raw_products_all = [] # Merchant analizi için ham verileri tut
|
||||
|
||||
for detail in report_data.get("details", []):
|
||||
if not detail.get("success") or not detail.get("file_path"):
|
||||
continue
|
||||
file_path = detail["file_path"]
|
||||
if not os.path.exists(file_path):
|
||||
continue
|
||||
try:
|
||||
with open(file_path, "r", encoding="utf-8") as f:
|
||||
cat_data = json.load(f)
|
||||
raw_products = cat_data.get("products", [])
|
||||
cat_name_raw = detail.get("category_name", "")
|
||||
cat_name = re.sub(r'\s+\d+$', '', cat_name_raw)
|
||||
|
||||
for raw in raw_products:
|
||||
# Set category on raw product for load_report_products compatibility
|
||||
if isinstance(raw.get("category"), dict):
|
||||
raw["category"]["name"] = cat_name
|
||||
else:
|
||||
raw["category"] = {"id": 0, "name": cat_name}
|
||||
|
||||
norm = normalize_product(raw, cat_name, social_details)
|
||||
if norm["price"] and norm["category"]:
|
||||
normalized_products.append(norm)
|
||||
|
||||
raw_products_all.extend(raw_products)
|
||||
except (json.JSONDecodeError, OSError, KeyError) as e:
|
||||
log.warning(f"Kategori dosyası okunamadı: {file_path}: {e}")
|
||||
continue
|
||||
|
||||
if not normalized_products:
|
||||
log.warning(f"Rapor {report_id} için ürün bulunamadı")
|
||||
return None
|
||||
|
||||
# ── Hesaplamalar ──
|
||||
kpis = calculate_kpis(normalized_products)
|
||||
charts = calculate_charts(normalized_products)
|
||||
insights = calculate_insights(normalized_products)
|
||||
|
||||
# Merchant analysis (ham veri gerekli)
|
||||
charts["merchant_analysis"] = _calculate_merchant_analysis(raw_products_all, {})
|
||||
|
||||
elapsed = time.time() - t0
|
||||
log.info(f"Rapor {report_id} konsolide edildi: {len(normalized_products)} ürün, {elapsed:.2f}s")
|
||||
|
||||
return {
|
||||
"metadata": {
|
||||
"report_id": report_id,
|
||||
"report_name": report.name,
|
||||
"created_at": report.created_at.isoformat() if report.created_at else None,
|
||||
"total_products": len(normalized_products),
|
||||
"total_categories": kpis["total_subcategories"],
|
||||
"consolidated_at": datetime.now().isoformat(),
|
||||
},
|
||||
"report_id": report_id,
|
||||
"report_name": report.name,
|
||||
"products": normalized_products,
|
||||
"all_products": normalized_products, # Geriye uyumluluk (frontend "all_products" bekliyor)
|
||||
"kpis": kpis,
|
||||
"charts": charts,
|
||||
"insights": insights,
|
||||
}
|
||||
|
||||
|
||||
# ─────────────────────────────────────────────────────────
|
||||
# 6. save / load
|
||||
# ─────────────────────────────────────────────────────────
|
||||
|
||||
def save_consolidated_report(report_id, data, reports_dir):
    """Save the consolidated data as reports/report_{id}_data.json."""
|
||||
path = os.path.join(reports_dir, f"report_{report_id}_data.json")
|
||||
os.makedirs(os.path.dirname(path), exist_ok=True)
|
||||
with open(path, "w", encoding="utf-8") as f:
|
||||
json.dump(data, f, ensure_ascii=False)
|
||||
log.info(f"Konsolide rapor kaydedildi: {path}")
|
||||
return path
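`save_consolidated_report` writes directly to the final path, so a dashboard request arriving mid-write could read a truncated file. One possible hardening, not something the commit itself does, is to write to a temporary file in the same directory and rename it into place. A minimal sketch, with `save_json_atomically` as a hypothetical helper:

```python
import json
import os
import tempfile


def save_json_atomically(path, data):
    """Write JSON to a temp file, then rename it over the target path."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    # The temp file must live in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False)
        os.replace(tmp_path, path)  # replaces any existing file in one step
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return path
```

Readers then see either the previous complete file or the new complete file, never a partial one.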
|
||||
|
||||
|
||||
def load_consolidated_report(report_id, reports_dir):
    """Read the consolidated file if it exists; otherwise return None."""
|
||||
path = os.path.join(reports_dir, f"report_{report_id}_data.json")
|
||||
if os.path.exists(path):
|
||||
try:
|
||||
with open(path, "r", encoding="utf-8") as f:
|
||||
return json.load(f)
|
||||
except (json.JSONDecodeError, OSError) as e:
|
||||
log.warning(f"Konsolide dosya okunamadı: {path}: {e}")
|
||||
return None
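The commit message describes old reports being consolidated lazily on their first dashboard request. The endpoint code is not part of this file, so the following is only a sketch of how such a load-or-build step could look; `load_or_build` and the `build` callback are hypothetical, with `build` standing in for a call to `build_consolidated_report`.

```python
import json
import os


def load_or_build(report_id, reports_dir, build):
    """Return the cached consolidated JSON, building and saving it if absent."""
    path = os.path.join(reports_dir, f"report_{report_id}_data.json")
    if os.path.exists(path):
        try:
            with open(path, "r", encoding="utf-8") as f:
                return json.load(f)
        except (json.JSONDecodeError, OSError):
            pass  # unreadable cache: fall through and rebuild
    data = build(report_id)
    if data is not None:
        os.makedirs(reports_dir, exist_ok=True)
        with open(path, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False)
    return data
```

With this shape the expensive build runs at most once per report, and later requests only pay for reading one file.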
|
||||
@@ -6,6 +6,9 @@ from sqlalchemy.ext.declarative import declarative_base
 from sqlalchemy.orm import sessionmaker, relationship
 from datetime import datetime
 import os
+from logging_config import get_logger
+
+log = get_logger("db")
 
 # PostgreSQL database - configurable via environment variable
 # Default: Local PostgreSQL for development
@@ -26,6 +29,7 @@ class Category(Base):
     parent_id = Column(Integer, ForeignKey('categories.id'), nullable=True)
     trendyol_category_id = Column(Integer, nullable=True)
     trendyol_url = Column(String, nullable=True)
+    path_model = Column(String, nullable=True)  # URL slug for search API (e.g. "elbise-x-c56")
     is_active = Column(Boolean, default=True)
     created_at = Column(DateTime, default=datetime.utcnow)
 
@@ -86,7 +90,7 @@ class EnrichmentError(Base):
 def init_db():
     """Initialize database - create tables"""
     Base.metadata.create_all(bind=engine)
-    print("✅ Database initialized successfully!")
+    log.info("Database initialized successfully")
 
 
 def get_db():
@@ -8,6 +8,9 @@ from pytrends.request import TrendReq
 from typing import Dict, Optional
 from datetime import datetime, timedelta
 import time
+from logging_config import get_logger
+
+log = get_logger("trends")
 
 
 class GoogleTrendsCache:
@@ -135,12 +138,12 @@ def fetch_google_trends(product_name: str, retries: int = 3) -> Dict:
 
         except Exception as e:
             error_msg = str(e)
-            print(f"Google Trends API Error (attempt {attempt + 1}/{retries}): {error_msg}")
+            log.warning(f"Google Trends API Error (attempt {attempt + 1}/{retries}): {error_msg}")
 
             # Rate limit error - wait longer
             if '429' in error_msg or 'rate' in error_msg.lower():
                 wait_time = 5 * (attempt + 1)  # 5, 10, 15 seconds
-                print(f"Rate limited. Waiting {wait_time} seconds...")
+                log.warning(f"Rate limited. Waiting {wait_time} seconds...")
                 time.sleep(wait_time)
                 continue
 
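The hunk above waits `5 * (attempt + 1)` seconds when an error looks like a rate limit. A standalone sketch of that linear backoff follows. `call_with_retries` is a hypothetical helper, not a function from the repository, and re-raising non-rate-limit errors immediately is an assumption, since the rest of the original `except` block is outside this hunk.

```python
import time


def call_with_retries(fn, retries=3, base_wait=5, sleep=time.sleep):
    """Call fn(); on a rate-limit style error wait 5, 10, 15... seconds and retry."""
    last_error = None
    for attempt in range(retries):
        try:
            return fn()
        except Exception as e:
            last_error = e
            error_msg = str(e)
            # Same detection rule as the hunk above: "429" or "rate" in the message.
            if "429" in error_msg or "rate" in error_msg.lower():
                sleep(base_wait * (attempt + 1))
                continue
            raise  # assumption: other errors are not retried in this sketch
    raise last_error


# Demo with a fake sleep that only records the requested waits.
waits = []
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("HTTP 429 Too Many Requests")
    return "ok"


result = call_with_retries(flaky, sleep=waits.append)
```

Passing `sleep` as a parameter keeps the demo instant; production code would leave the default `time.sleep`.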
197
backend/logging_config.py
Normal file
@@ -0,0 +1,197 @@
+"""
+Structured Logging Configuration for Trendyol Product Dashboard
+
+Provides:
+- JSON structured logs to file (for machine parsing)
+- Colored console logs (for human reading)
+- Correlation ID tracking per request/report
+- Rotating file handlers with size limits
+- Timing context manager for operation profiling
+"""
+
+import logging
+import logging.handlers
+import json
+import os
+import time
+from contextvars import ContextVar
+from contextlib import contextmanager
+from datetime import datetime, timezone
+
+# ---------------------------------------------------------------------------
+# Context variables for log correlation
+# ---------------------------------------------------------------------------
+
+_correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")
+_report_id: ContextVar[str] = ContextVar("report_id", default="-")
+
+
+def set_correlation_id(cid: str):
+    _correlation_id.set(cid)
+
+
+def get_correlation_id() -> str:
+    return _correlation_id.get()
+
+
+def set_report_id(rid):
+    _report_id.set(str(rid) if rid is not None else "-")
+
+
+def get_report_id() -> str:
+    return _report_id.get()
+
+
+# ---------------------------------------------------------------------------
+# JSON Formatter (file output)
+# ---------------------------------------------------------------------------
+
+class JSONFormatter(logging.Formatter):
+    """Structured JSON log formatter for file output."""
+
+    def format(self, record: logging.LogRecord) -> str:
+        log_entry = {
+            "ts": datetime.now(timezone.utc).isoformat(),
+            "level": record.levelname,
+            "logger": record.name,
+            "msg": record.getMessage(),
+            "correlation_id": get_correlation_id(),
+            "report_id": get_report_id(),
+        }
+
+        # Add extra fields if present
+        for key in ("url", "status_code", "response_time_ms", "response_size",
+                    "error_type", "duration_ms", "cb_state", "failures",
+                    "batch_size", "product_count", "cache_size"):
+            val = getattr(record, key, None)
+            if val is not None:
+                log_entry[key] = val
+
+        # Add exception info
+        if record.exc_info and record.exc_info[0] is not None:
+            log_entry["exception"] = self.formatException(record.exc_info)
+
+        return json.dumps(log_entry, ensure_ascii=False, default=str)
+
+
+# ---------------------------------------------------------------------------
+# Console Formatter (colored, human-readable)
+# ---------------------------------------------------------------------------
+
+_LEVEL_COLORS = {
+    "DEBUG": "\033[36m",  # cyan
+    "INFO": "\033[32m",  # green
+    "WARNING": "\033[33m",  # yellow
+    "ERROR": "\033[31m",  # red
+    "CRITICAL": "\033[1;31m",  # bold red
+}
+_RESET = "\033[0m"
+
+
+class ConsoleFormatter(logging.Formatter):
+    """Colored, human-readable console formatter."""
+
+    def format(self, record: logging.LogRecord) -> str:
+        color = _LEVEL_COLORS.get(record.levelname, "")
+        ts = datetime.now().strftime("%H:%M:%S")
+        level = record.levelname[0]  # D, I, W, E, C
+        report = get_report_id()
+        report_tag = f" [r:{report}]" if report != "-" else ""
+
+        msg = record.getMessage()
+        base = f"{color}{ts} [{level}]{report_tag} {msg}{_RESET}"
+
+        if record.exc_info and record.exc_info[0] is not None:
+            base += "\n" + self.formatException(record.exc_info)
+
+        return base
+
+
+# ---------------------------------------------------------------------------
+# Setup function
+# ---------------------------------------------------------------------------
+
+def setup_logging(log_dir: str = None):
+    """
+    Configure the entire logging system. Call once at startup.
+
+    Creates:
+    - logs/trendyol.log (all levels, JSON, 10MB x 5 rotation)
+    - logs/errors.log (WARNING+, JSON, 10MB x 3 rotation)
+    - console output (INFO+, colored)
+    """
+    if log_dir is None:
+        log_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "logs")
+
+    os.makedirs(log_dir, exist_ok=True)
+
+    root = logging.getLogger("trendyol")
+    root.setLevel(logging.DEBUG)
+
+    # Prevent duplicate handlers on reload
+    if root.handlers:
+        return
+
+    json_fmt = JSONFormatter()
+    console_fmt = ConsoleFormatter()
+
+    # 1. Main log file: all levels, JSON
+    main_handler = logging.handlers.RotatingFileHandler(
+        os.path.join(log_dir, "trendyol.log"),
+        maxBytes=10 * 1024 * 1024,  # 10 MB
+        backupCount=5,
+        encoding="utf-8",
+    )
+    main_handler.setLevel(logging.DEBUG)
+    main_handler.setFormatter(json_fmt)
+    root.addHandler(main_handler)
+
+    # 2. Error log file: WARNING+, JSON
+    error_handler = logging.handlers.RotatingFileHandler(
+        os.path.join(log_dir, "errors.log"),
+        maxBytes=10 * 1024 * 1024,
+        backupCount=3,
+        encoding="utf-8",
+    )
+    error_handler.setLevel(logging.WARNING)
+    error_handler.setFormatter(json_fmt)
+    root.addHandler(error_handler)
+
+    # 3. Console: INFO+, colored
+    console_handler = logging.StreamHandler()
+    console_handler.setLevel(logging.INFO)
+    console_handler.setFormatter(console_fmt)
+    root.addHandler(console_handler)
+
+    # Quiet noisy libraries
+    logging.getLogger("urllib3").setLevel(logging.WARNING)
+    logging.getLogger("sqlalchemy").setLevel(logging.WARNING)
+    logging.getLogger("sqlalchemy.engine").setLevel(logging.WARNING)
+
+
+# ---------------------------------------------------------------------------
+# Logger factory
+# ---------------------------------------------------------------------------
+
+def get_logger(name: str) -> logging.Logger:
+    """Get a namespaced logger: trendyol.<name>"""
+    return logging.getLogger(f"trendyol.{name}")
+
+
+# ---------------------------------------------------------------------------
+# Timing context manager
+# ---------------------------------------------------------------------------
+
+@contextmanager
+def log_timing(logger: logging.Logger, operation: str, level=logging.INFO, **extra):
+    """Context manager that logs operation duration."""
+    start = time.monotonic()
+    try:
+        yield
+    finally:
+        elapsed_ms = round((time.monotonic() - start) * 1000, 1)
+        logger.log(
+            level,
+            f"{operation} completed in {elapsed_ms}ms",
+            extra={"duration_ms": elapsed_ms, **extra},
+        )
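As a rough illustration of how `log_timing` feeds the whitelisted extra fields in `JSONFormatter`, the sketch below repeats the context manager and attaches a small in-memory handler. `ListHandler` and the `trendyol.demo` logger name are hypothetical and exist only for this example; in the project itself a module would presumably call `get_logger(...)` after `setup_logging()` has run.

```python
import logging
import time
from contextlib import contextmanager


@contextmanager
def log_timing(logger, operation, level=logging.INFO, **extra):
    """Same logic as the context manager above, repeated to keep the sketch standalone."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed_ms = round((time.monotonic() - start) * 1000, 1)
        logger.log(
            level,
            f"{operation} completed in {elapsed_ms}ms",
            extra={"duration_ms": elapsed_ms, **extra},
        )


class ListHandler(logging.Handler):
    """Hypothetical handler that keeps records in memory for inspection."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


log = logging.getLogger("trendyol.demo")
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the demo output out of any parent handlers
handler = ListHandler()
log.addHandler(handler)

with log_timing(log, "consolidate", product_count=240):
    time.sleep(0.01)

record = handler.records[0]
# Keys passed via `extra` become attributes on the LogRecord, which is how
# JSONFormatter can pick up duration_ms and product_count with getattr().
print(record.getMessage(), record.duration_ms, record.product_count)
```

One caveat worth knowing: the standard library rejects `extra` keys that collide with built-in `LogRecord` attributes such as `name` or `message`, so keyword arguments passed to `log_timing` should avoid those names.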
1714
backend/main.py
File diff suppressed because it is too large
@@ -10,6 +10,9 @@ import math
 import os
 from typing import Dict, List, Any, Optional
 from datetime import datetime
+from logging_config import get_logger
+
+log = get_logger("scraper")
 
 
 class TrendyolScraper:
@@ -55,7 +58,7 @@ class TrendyolScraper:
             response.raise_for_status()
             return response.json()
         except requests.exceptions.RequestException as e:
-            print(f"❌ Page {page} error: {e}")
+            log.warning(f"Page {page} error: {e}")
             return None
 
     def get_total_count(self) -> int:
@@ -96,7 +99,7 @@ class TrendyolScraper:
         # Calculate the number of pages
         total_pages = self.calculate_total_pages(total_count, max_pages)
 
-        print(f"📦 Category {self.category_id}: {total_count} products, {total_pages} pages to fetch")
+        log.info(f"Category {self.category_id}: {total_count} products, {total_pages} pages to fetch")
 
         # Fetch the pages
         all_products = []
@@ -105,7 +108,7 @@ class TrendyolScraper:
             data = self.fetch_page(page)
 
             if not data or not data.get('isSuccess'):
-                print(f"⚠️ Page {page} skipped")
+                log.warning(f"Page {page} skipped")
                 continue
 
             products = data.get('products', [])
@@ -144,7 +147,7 @@ class TrendyolScraper:
 
             return True
         except Exception as e:
-            print(f"❌ File save error: {e}")
+            log.error(f"File save error: {e}")
             return False
 
     def get_category_info(self) -> Optional[Dict[str, Any]]:
@@ -157,6 +160,112 @@ class TrendyolScraper:
         return data.get('categoryInfo', {})
 
 
+class TrendyolSearchScraper:
+    """Fetches products via the Trendyol Search API; works for all category types (-c and -s)"""
+
+    API_BASE_URL = "https://apigw.trendyol.com/discovery-sfint-search-service/api/search/products"
+
+    def __init__(self, path_model: str, page_size: int = 24):
+        self.path_model = path_model
+        self.page_size = page_size
+        self.headers = {
+            "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
+            "Accept": "application/json",
+            "Referer": f"https://www.trendyol.com/{path_model}",
+            "Origin": "https://www.trendyol.com"
+        }
+        self.cookies = {
+            "storefrontId": "1",
+            "language": "tr",
+            "countryCode": "TR"
+        }
+
+    def fetch_page(self, page: int) -> Optional[Dict[str, Any]]:
+        """Fetch a single page"""
+        params = {
+            "pathModel": self.path_model,
+            "pi": page,
+            "ps": self.page_size,
+            "channelId": 1,
+            "storefrontId": 1,
+            "culture": "tr-TR"
+        }
+        try:
+            response = requests.get(
+                self.API_BASE_URL,
+                params=params,
+                headers=self.headers,
+                cookies=self.cookies,
+                timeout=15
+            )
+            response.raise_for_status()
+            return response.json()
+        except requests.exceptions.RequestException as e:
+            log.warning(f"Search API page {page} error ({self.path_model}): {e}")
+            return None
+
+    def fetch_all_products(self, delay: float = 1.0, max_pages: int = 10) -> List[Dict[str, Any]]:
+        """Fetch and normalize all products (max_pages=10 x page_size=24 = 240 products)"""
+        first = self.fetch_page(1)
+        if not first:
+            return []
+
+        total = first.get("total", 0) or first.get("totalCount", 0) or first.get("roughTotal", 0)
+        raw_products = first.get("products", [])
+
+        if total == 0 and not raw_products:
+            return []
+
+        # Even if total is 0, fetch at least 1 page when products are present
+        if total == 0 and raw_products:
+            total = len(raw_products)
+
+        total_pages = min(math.ceil(total / self.page_size), max_pages)
+        log.info(f"Search API {self.path_model}: {total} products, {total_pages} pages to fetch")
+
+        for page in range(2, total_pages + 1):
+            data = self.fetch_page(page)
+            if data and data.get("products"):
+                raw_products.extend(data["products"])
+            if page < total_pages:
+                time.sleep(delay)
+
+        return [_normalize_search_product(p) for p in raw_products]
+
+
+def _normalize_search_product(raw: dict) -> dict:
+    """Make the Search API product format compatible with the existing system"""
+    brand = raw.get("brand", {})
+    if isinstance(brand, str):
+        brand = {"name": brand}
+
+    price = raw.get("price", {})
+    if isinstance(price, (int, float)):
+        price = {"sellingPrice": price, "originalPrice": price}
+    elif isinstance(price, dict) and "sellingPrice" not in price:
+        # Search API returns current/discountedPrice/originalPrice; map to sellingPrice
+        price["sellingPrice"] = price.get("discountedPrice") or price.get("current") or price.get("originalPrice") or price.get("old") or 0
+
+    rating = raw.get("ratingScore", {})
+    if rating is None:
+        rating = {}
+
+    return {
+        "id": raw.get("id") or raw.get("contentId"),
+        "name": raw.get("name", ""),
+        "brand": brand,
+        "price": price,
+        "ratingScore": rating,
+        "url": raw.get("url", ""),
+        "imageUrl": raw.get("image", raw.get("imageUrl", "")),
+        "merchantListings": raw.get("merchantListings", []),
+        "winnerVariant": raw.get("winnerVariant", {}),
+        "socialProofs": raw.get("socialProofs", []),
+        "categoryId": raw.get("categoryId"),
+        "categoryName": raw.get("categoryName"),
+    }
+
+
 def scrape_category(category_id: int, category_name: str, output_dir: str = "../categories") -> Dict[str, Any]:
     """
     Scrape a single category
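Two small rules in the hunk above are easy to misread: the price fallback chain and the page cap. The sketch below isolates both so they can be checked by hand. `pick_selling_price` and `pages_to_fetch` are hypothetical helper names, not functions from the repository, and the sample price dictionaries are made up for illustration.

```python
import math


def pick_selling_price(price):
    """Mirror the fallback order used by _normalize_search_product above."""
    if isinstance(price, (int, float)):
        return price
    if isinstance(price, dict):
        if "sellingPrice" in price:
            return price["sellingPrice"]
        # `or` skips falsy values, so a price of 0 falls through to the next key.
        return (price.get("discountedPrice") or price.get("current")
                or price.get("originalPrice") or price.get("old") or 0)
    return 0


def pages_to_fetch(total, page_size=24, max_pages=10):
    """Pages needed for `total` products, capped at max_pages."""
    return min(math.ceil(total / page_size), max_pages)


print(pick_selling_price({"discountedPrice": 80, "originalPrice": 100}))
print(pick_selling_price({"current": 0, "originalPrice": 100}))
print(pages_to_fetch(25), pages_to_fetch(1000))
```

With the defaults, at most 10 x 24 = 240 products are fetched per category, which matches the figure in the `fetch_all_products` docstring.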
@@ -227,9 +336,7 @@ def scrape_multiple_categories(categories: List[tuple], delay: float = 2.0) -> D
     }
 
     for i, (cat_id, cat_name) in enumerate(categories, 1):
-        print(f"\n{'='*80}")
-        print(f"📂 [{i}/{len(categories)}] {cat_name} (ID: {cat_id})")
-        print('='*80)
+        log.info(f"[{i}/{len(categories)}] {cat_name} (ID: {cat_id})")
 
         result = scrape_category(cat_id, cat_name)
         results["details"].append(result)
@@ -237,10 +344,10 @@ def scrape_multiple_categories(categories: List[tuple], delay: float = 2.0) -> D
         if result["success"]:
             results["successful"] += 1
             results["total_products"] += result["total_products"]
-            print(f"✅ Success: {result['total_products']} products")
+            log.info(f"Success: {result['total_products']} products")
         else:
             results["failed"] += 1
-            print(f"❌ Error: {result['error']}")
+            log.error(f"Error: {result['error']}")
 
         # Wait between categories
         if i < len(categories):