Phenotypic datasets are increasingly rich and heterogeneous, combining images, time courses, manual measurements, processed variables, and metadata. Managing such datasets means balancing two partly incompatible objectives: (i) facilitating data analysis by extracting, organizing, and storing the relevant variables; and (ii) allowing reuse of raw, synthesized, and computed data, in line with the FAIR principles. For the first objective, ‘dedicated datasets’ can be extracted from raw information and tailored to a user’s data analysis, but this extraction entails a massive loss of information. For the second objective, we advocate ‘sensu stricto phenomic datasets’ that sit upstream of dedicated datasets and organize data without loss of information, using data-science tools in a ‘theory-agnostic’ way. Such datasets allow different users to build their own ‘dedicated datasets’ according to their planned data analyses.
Pommier, C., Alic, I., Cabrera-Bosquet, L., Draye, X., Neveu, P., Reif, J. C., Robbins, K. R., Krajewski, P., & Tardieu, F. (2025). Reassessing data management in increasingly complex phenotypic datasets. Trends in Plant Science, 12. https://doi.org/10.1016/j.tplants.2025.09.001