Earnings Call Analyser | Jevan Cousins

Overview

Started in Q1 2026. The project tests the result from Chiang et al. 2025 that alignment between management answers and analyst questions in earnings calls correlates with forward equity returns. The pipeline ingests earnings call transcripts, extracts Q&A pairs, embeds them with FinBERT, scores alignment with a PyTorch contrastive-learning classifier, and backtests the signal against forward returns. The architecture is complete, but the classifier has not yet been trained; the next step is building a labelled training set so it can be fit.

Problem

Most equity signals rely on price action or accounting data, but earnings calls also carry linguistic and semantic information. If management and analysts are misaligned (management gives evasive or off-topic answers), does that predict returns? Chiang et al. suggest it does. This project tests that hypothesis on US equities.

Key Decisions and Trade-offs

FinBERT for embeddings. ProsusAI/FinBERT is a BERT model adapted to financial text, so it is expected to represent financial language better than generic BERT.
Contrastive learning architecture. Q&A pairs generate positive samples; Q-A' pairs (mismatched) generate negative samples. Training minimises embedding distance for aligned pairs and maximises it for mismatched pairs. Simpler than multi-class classification.
Combination loss function. Cross-entropy (classification) + MSE (alignment scores) + SimCSE-inspired contrastive loss. Trade-off is tuning complexity; payoff is richer signal.
Kubernetes for deployment. Containerised API and dashboard for reproducibility and scaling. Overkill for MVP but enables production validation.
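
The combined loss described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the function name combined_loss, the equal default weights, the temperature of 0.05, and the use of other answers in the same batch as the mismatched Q-A' negatives are all assumptions.

```python
import torch
import torch.nn.functional as F


def combined_loss(q_emb, a_emb, logits, labels, pred_score, target_score,
                  temperature=0.05, weights=(1.0, 1.0, 1.0)):
    """Cross-entropy + MSE + in-batch contrastive loss (SimCSE-inspired).

    q_emb, a_emb: (B, D) projected question/answer embeddings; row i of each
                  is assumed to come from the same Q&A exchange.
    logits:       (B, C) class logits; labels: (B,) integer class ids.
    pred_score, target_score: (B,) alignment scores in [0, 1].
    All names and defaults here are hypothetical.
    """
    ce = F.cross_entropy(logits, labels)
    mse = F.mse_loss(pred_score, target_score)

    # Cosine similarity between every question and every answer in the batch.
    q = F.normalize(q_emb, dim=-1)
    a = F.normalize(a_emb, dim=-1)
    sim = q @ a.T / temperature  # shape (B, B)

    # Diagonal entries are the true pairs; off-diagonal answers act as the
    # mismatched Q-A' negatives.
    targets = torch.arange(sim.size(0), device=sim.device)
    contrastive = F.cross_entropy(sim, targets)

    w_ce, w_mse, w_con = weights
    total = w_ce * ce + w_mse * mse + w_con * contrastive
    parts = {"ce": ce.item(), "mse": mse.item(), "contrastive": contrastive.item()}
    return total, parts
```

Returning the individual terms alongside the total makes the tuning trade-off visible: if one term dominates during training, its weight can be lowered without guessing.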

Stack and Why

Layer	Technology	Rationale
ML Framework	PyTorch 2.x	Research flexibility, debugging, academic standard.
Embeddings	ProsusAI/FinBERT	Domain-specific financial BERT, ready to fine-tune out of the box.
Transformer Library	HuggingFace Transformers	FinBERT integration, preprocessing, model management.
Backend API	FastAPI	Async endpoints for analysis, company comparison, backtesting.
Database	PostgreSQL	Transcript storage, embedding persistence, query efficiency.
Frontend	Streamlit	Quick interactive dashboards. Plotly for alignment timelines and returns.
Deployment	Docker, Kubernetes (Minikube)	Reproducible containers, Kubernetes for scaling and orchestration.
Data Sources	Financial Modeling Prep API, SEC EDGAR, yfinance	Transcripts, returns data, financial metrics.

What Shipped

Transcript ingestion: Fetch from Financial Modeling Prep API and SEC EDGAR. Parse Q&A segments.
Embedding pipeline: FinBERT embeddings for questions and answers. Dimension: 768.
Alignment classifier: PyTorch neural network with question/answer projection heads and the combined loss function. Outputs an alignment score between 0 and 1. The architecture is implemented but not yet trained.
Question categorisation: Multi-label classification over five topics: margins, guidance, competition, macro, capital structure.
Backtesting framework: Historical alignment scores vs forward 20/60-day returns. Sharpe ratio, information ratio, drawdown analysis.
API and dashboard: FastAPI backend with analysis endpoints. Streamlit frontend with alignment timeline, sector heatmaps, company rankings.
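
The risk metrics named in the backtesting framework can be computed along these lines. This is a sketch using only the standard library; the function names, the 252-period annualisation, and the zero risk-free rate are assumptions rather than the project's actual settings.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, periods_per_year=252, risk_free=0.0):
    """Annualised Sharpe ratio from per-period simple returns."""
    excess = [r - risk_free for r in returns]
    sd = stdev(excess)  # sample standard deviation
    if sd == 0:
        return float("nan")
    return mean(excess) / sd * math.sqrt(periods_per_year)


def information_ratio(returns, benchmark, periods_per_year=252):
    """Annualised mean active return divided by tracking error."""
    active = [r - b for r, b in zip(returns, benchmark)]
    return sharpe_ratio(active, periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough fall of the compounded equity curve (fraction)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst
```

Set periods_per_year to match how the returns are sampled: 252 suits daily returns, while non-overlapping 20-day returns would be closer to 12 or 13 periods per year.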

Metrics (Planned)

Alignment score correlation with forward returns (target: > 0.15)
Backtesting Sharpe ratio (target: > 0.8)
Classification accuracy for question categorisation (target: > 85%)
Model training on 500+ earnings calls covering S&P 500 constituents
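
The first target above does not say which correlation is meant. A sketch of the two usual candidates follows: Pearson (linear) and Spearman (rank, often used for signal evaluation because it is less sensitive to outliers). The function names are hypothetical, and choosing between the two is left open here.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

Here x would be the alignment scores per call and y the matching forward returns, one pair per earnings call.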

What Is Next

Labelled training data. Manually score 100 to 200 Q&A pairs with alignment labels. This is the critical blocker: the classifier cannot be fit without it.
Train the AlignmentClassifier on the labelled set, tuning the combined loss.
Backtest the alignment signal against 2020 to 2025 earnings seasons, evaluating correlation with forward 20-day and 60-day returns.
Attribute the signal by question topic (margins, guidance, competition, macro, capital structure) to understand what's actually driving it.
Expand to international equities if the US signal validates; consider integration with portfolio workflows.
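
For the backtest step above, forward 20-day and 60-day returns could be derived from a list of daily closes roughly as follows. The function names and the entry_lag parameter are assumptions; entering one trading day after the call is a common way to avoid look-ahead when a call takes place after the close, but the project may handle timing differently.

```python
def forward_return(closes, call_idx, horizon, entry_lag=1):
    """Simple return over `horizon` trading days, entered `entry_lag` days
    after the call. Returns None when the price history is too short."""
    start = call_idx + entry_lag
    end = start + horizon
    if call_idx < 0 or end >= len(closes):
        return None
    return closes[end] / closes[start] - 1.0


def forward_returns(closes, call_idx, horizons=(20, 60)):
    """Forward returns for each horizon, keyed by horizon length."""
    return {h: forward_return(closes, call_idx, h) for h in horizons}
```

Calls too close to the end of the price history return None rather than a truncated return, so they can be dropped before computing correlations.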

If the signal validates (target correlation > 0.15, target Sharpe > 0.8), this becomes a personal trading strategy. If it doesn't, the architecture (FinBERT + contrastive learning + backtesting) generalises to other financial NLP problems I care about.

Status: Proof of Concept, started Q1 2026. Based on Chiang et al. 2025 research.

Learn more: GitHub repository