Rattenbury et al. 2017 (†871) Rattenbury, Tye, Joseph M. Hellerstein, Jeffrey Heer, Sean Kandel, and Connor Carreras. Principles of Data Wrangling: Practical Techniques for Data Preparation (O'Reilly, 2017).
- data wrangling (p. ix): The phrase data wrangling, born in the modern context of agile analytics, is meant to describe the lion’s share of the time people spend working with data. . . . 50 to 80 percent of an analyst’s time is spent wrangling data to get it to the point at which this kind of analysis is possible. Not only does data wrangling consume most of an analyst’s workday, it also represents much of the analyst’s professional process: it captures activities like understanding what data is available; choosing what data to use and at what level of detail; understanding how to meaningfully combine multiple sources of data; and deciding how to distill the results to a size and shape that can drive downstream analysis. . . . and in the context of agile analytics, these activities also capture the creative and scientific intuition of the analyst, which can dictate different decisions for each use case and data source. (†2616)