Wednesday, June 28, 2017

Towards an interactive data seeking and research model #i3rgu

I'm talking in the next session at the i3 conference at RGU in Aberdeen so, apologies, this will likely be a brief liveblog. Professor Gobinda Chowdhury (Northumbria University) presented a keynote on an interactive data seeking and retrieval model.
He started by talking about the pressures and rule changes that go on in the research field. One challenge he mentioned was the open access era: both the growing requirement to make primary research data open access, and open access publishing. He saw a corresponding growth in the data management role of librarians. Chowdhury framed data as "the new gold (or oil)" (mentioning Uber, Amazon etc.), and pondered on the difference between information and data. He saw metadata and strings (words and phrases) as being the key elements in information retrieval, as opposed to data retrieval. For data, people normally need to use other tools and systems to make sense of it. The numbers alone will not have meaning without the context, e.g. the names of the variables, and the contextual knowledge of what lies behind them. He also identified specific characteristics of research data.
Chowdhury mentioned results from a survey in 3 countries asking about types of research data and what is done with it. He also reflected on how one might access research data: one example is a link from a published article. In that case you might get the context from the article. However, often you may be accessing data without that context. This is an issue with many of the searchable research data repositories: you need to know a lot about the characteristics of the data before you can effectively search for it. Additionally, there is potentially a lot of metadata you could add to your research data (e.g. the JISC repository profile has loads of fields), but few researchers may actually provide all this metadata. 30% of the people in their research data survey were unfamiliar with metadata and 37% did not assign tags to their data sets. Chowdhury also returned to the point that it was difficult to tell whether data was going to be useful until you had downloaded and explored it. The speaker identified a training gap in terms of understanding effective data sharing behaviour.
The international survey also identified challenges to data sharing, e.g. legal and ethical concerns, and concerns about misinterpretation of data. Chowdhury then presented his model - interestingly it included the researchers themselves, the community culture, and the nature of the discipline, project, process, resources and (importantly) incentives (and not just metadata, indexes etc.).
Final points included that currently there was a focus on data discovery rather than data access, replicating traditional information retrieval systems. Key challenges focused on making sense of, and using, the research data, and on accessing research data.

