Beyond statistical changepoint detection: Semantic interpretation of time series via large language models
2026
Changepoint detection algorithms identify where structural breaks occur, but their output is conventionally read as if each detected break maps one-to-one onto a real-world event. We show that this mapping assumption is undermined by a fundamental ambiguity: the confidence interval for a detected break widens as the slope jump shrinks, so a wide interval may indicate either a mild but genuine break or an approximation artifact from fitting piecewise-linear segments to nonlinear dynamics. The two cases cannot be distinguished from the time series alone. We therefore propose a different paradigm that treats the ℓ0 changepoint output as a sparse piecewise-linear representation whose slope transitions and confidence intervals serve as structured inputs for LLM semantic interpretation, grounded by in-context learning examples and external knowledge retrieval. The LLM classifies the resulting patterns into three categories: isolated structural breaks, coherent multi-changepoint structures, and nonlinear dynamic transitions. On two FRED economic time series, our framework achieves perfect recall against NBER recession dates while recovering semantic structures that traditional methods detect but cannot interpret, such as grouping four ambiguous Volcker-era changepoints into one coherent event.
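As an illustration of what "structured inputs" could look like, the following minimal Python sketch turns a list of detected changepoints into records carrying the slope jump and confidence-interval width, then serializes them for inclusion in a prompt. It is not the authors' implementation. The `Changepoint` fields, the `describe` and `build_prompt_payload` helpers, and the `wide_ci` threshold of 12 samples are all hypothetical names and values chosen for this example.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Changepoint:
    """One detected break in a piecewise-linear fit (hypothetical schema)."""
    index: int           # position of the break in the series
    slope_before: float  # slope of the segment ending at the break
    slope_after: float   # slope of the segment starting at the break
    ci_low: int          # lower bound of the break's confidence interval
    ci_high: int         # upper bound of the break's confidence interval


def describe(cp: Changepoint, wide_ci: int = 12) -> dict:
    """Add the slope jump, the CI width, and an ambiguity flag to one break.

    `wide_ci` is an assumed threshold, not a value taken from the paper.
    """
    record = asdict(cp)
    record["slope_jump"] = round(cp.slope_after - cp.slope_before, 6)
    record["ci_width"] = cp.ci_high - cp.ci_low
    # A wide interval may mean a mild genuine break or a nonlinear artifact;
    # the flag only marks the case as needing semantic interpretation.
    record["ambiguous"] = record["ci_width"] >= wide_ci
    return record


def build_prompt_payload(changepoints: list[Changepoint]) -> str:
    """Serialize the breaks, ordered by position, as JSON for an LLM prompt."""
    ordered = sorted(changepoints, key=lambda cp: cp.index)
    return json.dumps([describe(cp) for cp in ordered], indent=2)
```

In this sketch a break with a large slope jump and a narrow interval is passed through as unambiguous, while a break with a small jump and a wide interval is flagged so that the downstream interpretation step can decide whether it is an isolated break, part of a multi-changepoint structure, or a nonlinear transition.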