A platform for computational script analysis for residencies and cinematographic institutions

Cinema has an ecosystem of script residencies, calls for entries, and project development programs that generate a considerable volume of textual material each year: scripts at different stages of writing, treatments, and successive versions of the same project. These documents contain structural, narrative, and thematic information that is rarely analyzed systematically and at a distance. Screenplay Analytics was born in this context. Artefacto developed it as an assisted reading tool for institutions that support writing processes, with the goal of shifting the gaze from immediate judgment toward the analysis of patterns that emerge only when a complete corpus is observed.

The platform is built in TypeScript with React and is organized into specialized visualization modules. A multi-user authentication system allows different institutions to access their own datasets. The central analytical work consists of decomposing the script scene by scene and extracting a broad set of variables from each scene using large language models. For each scene, the system identifies the estimated duration, whether the action takes place indoors or outdoors, the time of day, the specific location, the characters present, their actions and dramatic objectives, active emotions, explicit and implicit themes, semantic fields with their relative weights, visual and sound elements, props with narrative value, unresolved elements, and dialogues with their subtext. It also calculates similarities between scenes based on shared elements. At the level of the whole script, it builds an analysis of each character: their transformations throughout the script, their relationships with other characters, and the evolution of those relationships in terms of tension.
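
To make the per-scene extraction and the similarity step more concrete, here is a minimal sketch in TypeScript. It is not the platform's actual schema or algorithm: the `SceneRecord` field names are hypothetical, only a subset of the variables listed above is modeled, and the Jaccard index averaged over three element types is an assumed stand-in for "similarity based on shared elements".

```typescript
// Hypothetical shape of one analyzed scene. Field names are illustrative,
// not the platform's real schema, and only some variables are included.
interface SceneRecord {
  index: number;
  setting: "interior" | "exterior";
  timeOfDay: string;
  location: string;
  estimatedMinutes: number;
  characters: string[];
  emotions: string[];
  themes: string[];
  semanticFields: Record<string, number>; // field name -> relative weight
}

// Jaccard index: size of the intersection divided by size of the union.
// Two empty lists are treated as having no similarity (an assumption).
function jaccard(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  setA.forEach((item) => {
    if (setB.has(item)) shared += 1;
  });
  return shared / (setA.size + setB.size - shared);
}

// Unweighted average of the overlap in characters, emotions, and themes.
// The choice of these three element types and equal weighting is assumed.
function sceneSimilarity(a: SceneRecord, b: SceneRecord): number {
  const parts = [
    jaccard(a.characters, b.characters),
    jaccard(a.emotions, b.emotions),
    jaccard(a.themes, b.themes),
  ];
  return parts.reduce((sum, value) => sum + value, 0) / parts.length;
}
```

A score of 0 means the two scenes share no characters, emotions, or themes; a score of 1 means all three lists coincide. A real system might weight the element types differently or compare semantic fields as well.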

The result is a dense data structure that converts the script text into a relational map of its own components. The visualization modules work on this data structure. A multi-layer narrative timeline shows the full sequence of scenes with superimposed bands encoding characters, emotions, themes, and semantic fields, connected by arcs representing similarities between scenes. A character matrix displays each figure's presence throughout the story, their arc, transformations, and relationships, with network graphs integrated into each cell. A semantic map organizes thematic clusters in nested treemaps where the size of each block reflects its weight in the overall script. A conflict and structural patterns panel identifies the central conflict type, narrative stakes, and rhythm patterns with their scene-by-scene distribution. Finally, a three-dimensional navigation space situates each scene on axes representing its position in the story, its emotional tension, and its dramatic complexity, producing a topography of the script that makes its internal architecture visible in a way that no linear reading can offer.
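
As an illustration of how a visualization module might derive its input from the per-scene data, the sketch below aggregates semantic-field weights into the block sizes a treemap would need. It assumes that a block's size is the field's share of the total weight summed over all scenes; the source does not specify the actual formula, and the names `FieldWeights`, `TreemapBlock`, and `semanticShares` are hypothetical. Nesting into thematic clusters is left out for brevity.

```typescript
// Hypothetical input: for each scene, a map of semantic field -> relative weight.
type FieldWeights = Record<string, number>;

interface TreemapBlock {
  field: string;
  share: number; // fraction of the script's total semantic weight, from 0 to 1
}

// Sums each field's weight over all scenes and normalizes by the grand total,
// returning blocks sorted from largest to smallest share.
function semanticShares(scenes: FieldWeights[]): TreemapBlock[] {
  const totals: Record<string, number> = {};
  let grandTotal = 0;
  for (const weights of scenes) {
    for (const field of Object.keys(weights)) {
      totals[field] = (totals[field] ?? 0) + weights[field];
      grandTotal += weights[field];
    }
  }
  if (grandTotal === 0) return []; // nothing to draw
  return Object.keys(totals)
    .map((field) => ({ field, share: totals[field] / grandTotal }))
    .sort((a, b) => b.share - a.share);
}
```

Because the shares sum to 1, they can be passed directly to a treemap layout as relative areas.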

The project is part of a research field that Franco Moretti formulated as "distant reading" and which in the audiovisual field has taken form under the concept of "distant viewing," developed by Arnold and Tilton as a methodology for studying large corpora of images and sequences. The central idea is the same in both cases: distant reading does not replace close interpretation but precedes and guides it, revealing structures that conventional reading cannot detect because it operates scene by scene, script by script. The research connects with that of Labov and Waletzky on narrative grammar, Greimas's work on actantial structures, and computational studies on patterns of emotional tension and narrative arcs across extensive literary corpora. In the specific field of cinematographic scripts, researchers like Scott Enderle and the Digital Humanities group at the University of Antwerp have applied similar techniques to detect generic conventions and recurring dramatic structures.

In the context of script residencies, this raises questions with practical and ideological consequences. What themes dominate the projects an institution selects? What types of narrative arcs prevail? What patterns of gender, space, and time are repeated? Computational analysis converts these questions into observable magnitudes without claiming to resolve them. The tool is an instrument of interrogation, not of evaluation.
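
One way to read "observable magnitudes" is as simple corpus-level counts. The sketch below computes, for a set of projects, the share of projects in which each theme appears. It is an assumed illustration, not the platform's method: the `ProjectThemes` shape and the `themePrevalence` function are hypothetical, and it presupposes that theme labels have already been normalized across scripts.

```typescript
// Hypothetical corpus entry: a project and the themes extracted from its script.
interface ProjectThemes {
  title: string;
  themes: string[];
}

// Returns, for each theme, the fraction of projects in which it appears
// at least once.
function themePrevalence(corpus: ProjectThemes[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const project of corpus) {
    // Deduplicate so a theme repeated within one project counts only once.
    const unique = project.themes.filter((t, i, all) => all.indexOf(t) === i);
    for (const theme of unique) {
      counts[theme] = (counts[theme] ?? 0) + 1;
    }
  }
  const prevalence: Record<string, number> = {};
  for (const theme of Object.keys(counts)) {
    prevalence[theme] = counts[theme] / corpus.length;
  }
  return prevalence;
}
```

The output describes what a selection contains, for example that a theme appears in three of four projects. Whether that concentration is desirable remains a question for the institution.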

The tool is under active construction. The visualization modules advance in parallel with the refinement of the semantic analysis system, and the multi-user architecture was designed from the beginning to accommodate different institutions with different corpora.

Application (requires authentication): https://screenplay-analytics-production.up.railway.app/