Notes from the Library & Information Science Dissertations Conference, 5 November 2016 held at University College London (#lisdis16)
Information Science and the Business Analyst
Last year I returned to University to study Information Science. Unlike fellow students who were either starting out in their career or using the course to change career, sometimes dramatically, I was and still am a Business Analyst.
So, why do it? Why take a year out and invest time and money in a higher university degree in the middle of my BA career? And why go to Library School when there are plenty of other options in Systems Engineering or Business Schools?
On Writing Essays
Like many of my #citylis classmates I’ve spent much of the past 2 months writing essays, with the remaining time spent prevaricating or indulging in some festive fun. This is what scholars do. Our job is to read widely, appraise critically, synthesise astutely, ferment wisely, distil succinctly and communicate clearly. Perhaps this is why we are always going on about coffee and alcohol too; we recognise and appreciate other alchemic crafts. Each publication is our craft brew of ideas.
I know people generally think students are ill-disciplined wastrels. However, most of my colleagues are either working and studying part-time as part of their professional development or they are studying full-time as a career-switching investment whilst juggling their family commitments. No-one wants to squander the privilege of being able to advance their knowledge, gain a qualification and have access to some of the best tutors and literature sources in the business.
Juggling the ideas, research and writing for four courses during the same period is a difficult mental challenge. It’s been hard work and downright stressful at times. Sometimes it comes easily, sometimes you think I’m never going to finish this. It took me a while to find my rhythm but now I’ve found a routine that works for me and once established have been disciplined about sticking to it. I work on average 5-6 hours a day, every day including weekends. That work includes information scanning, reading, writing and ‘busy work’ in that order. I run and do yoga to rest my brain and eyes and make sure my body doesn’t seize up from too much sitting at a desk. My guilty pleasure is watching Escape to the Country with a cup of tea in the afternoons. Most importantly I rarely skip second breakfast.
One of the things I want to carry into Term 2 is that routine. Now that we’ve experienced a whole term cycle and its demands you get a better feel for the pace of work. Most classmates have been commenting on the perils of procrastination and the pressures of deadlines. One of the maddening things about research is the more you read the more you realise there is to read. Sometimes hours can pass and you realise all you have done is collect even more things to read. No actual reading. Certainly no writing. If only they offered a Masters of List Making. Next time of course it will all be different.
I doubt it.
We will likely still procrastinate. Sometimes it’s a genuine avoidance tactic; sometimes it’s what you might call a sorbet for the mind. A cleansing activity between more demanding work phases. Sometimes you need to step away and stop hammering at a problem to allow your sub-conscious the opportunity to ease the passage of stubborn thoughts. Sometimes it’s a sign of tiredness and a signal to take a break. Part of learning is understanding there are always areas for improvement to look for and thinking about how you will at least try to do some things slightly differently next time.
One of mine would be to plan better to allow more contingency for ‘off hours’ and ‘off days’ to make sure I’m using energy wisely and not wasting too much attention and to lessen the despair when a session feels like no progress has been made. I now know how much effort on average it takes me to write an essay of around 3,000 words. I also know from my running training that periodisation is important and I organise my running into micro, meso and macro cycles to balance effort and optimise performance. I want to think across the whole term and plan a framework for my intellectual activities in the same way.
I also want to write more regularly. I found with writing essays the hardest part is always getting started. Sometimes you just have to sit and write and perhaps that will be easier if writing becomes a habit. Last term we had our DITA blogging to practice writing and I’ll continue to blog. To this I want to add more time spent writing reflective notes after reading important sources and putting more words in towards assignments more regularly.
I Love Scrivener (and Zotero and Todoist)
One of the things that’s really helped me with this is buying Scrivener and using it to collect my research, structure my ideas and write my early drafts for each writing project.
Early this term I want to explore Scrivener more to learn more about its features and take more advantage of it. I could improve how I combine it with Zotero for my references and understand the formatting options better so I don’t have to do so much final formatting manually. I’ve started a CityLIS template to store my settings and I’ll want to refine and enhance this before writing my next set of essays and the Big D.
Sitting down to organise my thoughts and start writing has definitely been easy with Scrivener and it joins Zotero and Evernote as an essential in my research toolkit.
I still love Zotero as my collect, curate and organise tool for research. I’ve started to think about how I can improve my Zotero habits to work better with the kind of research I’m doing now. I’ve been using collections more to support early assignment research and then selection of sources. I’m using their colour coded tags to indicate which resources are read, unread, to read and have been cited. I’m also using tags to record how deeply I’ve read a source. Have I done a quick skim (mostly), a deep read (some) or a full critical appraisal (hardly any unless you have to write an essay on such a thing).
Zotero is based on the idea of index cards (remediated practice!) so I’ve also started using standalone notes to capture concepts and definitions and put more effort into connecting related items to each other. Of course these kinds of good habits do get neglected the closer you get to a deadline so now is a good time to try and get these habits embedded and tidy up both my Zotero library and my overflowing Evernote shoebox of interesting things I’ve saved.
Smaller Actionable Chunks
Another thing that really worked for me was breaking work into smaller tasks. This does help with procrastination and organising a schedule. ‘Write essay’ is a really hard task to get going with. The activity is too vague, the reward too far away. Humans are just not psychologically equipped to work with this. I’ve been using Todoist to manage my tasks for a while. One of the reasons is that it allows sub-tasks. Along with priorities, tagging and easy scheduling it’s really easy to organise both macro tasks and an action list of micro tasks to work through each day.
I’ve now got a template ‘Write Essay’ task for each essay structure that includes the main phases to work through to which I add specific tasks under that. For everything I want to read I add it as a Todoist task. For books I add each Chapter as a sub-task. Latterly, once I’d worked out my outline structure I started adding write section tasks underneath a write First Draft task. Write 250 words would also work. Yes it does take a bit of time to make everything actionable. I told you I would get a Masters in List Making.
Yet, there’s nothing that beats procrastination better than checking off a task. Bing! Stuff read. Bing! Stuff written. Yay! Todoist also adds a bit of gamification by giving you karma points and graphing your productivity trends. More than this I did find it comforting at the end of each work session to have a clear idea of what I was going to work on in the next session by adding priorities and scheduling tasks. Be realistic though: if you look at your list of things to do today and it’s dauntingly long then you and procrastination are just going to hang out a bit longer, because where do you start with that?
Looking Ahead to Term 2
In Term 2 citylis core modules now vary between the MA/MSc in Library Science and the MSc in Information Science and there is one elective choice so our happy cohort will mix up a bit. This term I will be learning about Information Retrieval, Information Resources and Organisation, Information Domains and hopefully Data Visualisation once the electives are confirmed. I’ll continue to write more generally in this blog along the way and am hoping to extend my DITA blog to cover topics in Data Visualisation which will be my most techie Term 2 module.
In the meantime I’ve bought some new running shoes in the sales to celebrate getting to this point and I look forward to breaking those in. It’s time to enjoy a few days rest: drinking wine, sleeping, catching up on the news, tackling the pile of domestic chores that have built up by the wayside and dipping into the pile of books I’ve accumulated that are in various states of ‘readness’ are all on the agenda. I also need to catch up on Last Tango in Halifax! Then onwards to Term 2 which begins on the 26th January.
Diving into Domains, Documents and Digital Ecosystems
CityLIS Term 1 Week 5. In which we dipped into domain analysis before going fully immersive; we practiced techniques for collecting and archiving tweets as a prelude for visualising and analysing them; intrepid citylisters took field trips to Highgate Cemetery (check out the DITA blogosphere for some interesting blogposts on this) and screened The Internet’s Own Boy; I investigated how big worlds can actually be quite small; we learnt about storing digital assets in repositories and what happens when you set them free; and we explored what makes good communication, (written and oral), and how to deal with the parts we find uncomfortable.
In DITA this week we explored issues around researching social media. Ernesto compared this to pinning butterflies. I find that metaphor makes me think more of capturing a specimen from the vortex of ideas this course unleashes and pinning it into my dissertation so I’m using the metaphor of catching waves instead. Forever rolling against the sands of time (and entropy) collecting and analysing social media feels like trying to map patterns in the shifting tides and waves that lap against our shores. So much of what we see is on the surface and ephemeral. This week’s session helped us venture into the deep. My submersible for this expedition was a Twitter API application I called DITA Venturi. I initially thought of this merely for its connotation with venturing but then I discovered the Venturi effect and realised I’d managed to quite aptly traverse from thermodynamics last week to fluid dynamics this week. Apparently the Venturi effect can convert pressure into suction and Venturi also invented a device for measuring flow through a pipe. Quite an apt analogy for sticking an application into the Twitter stream and trying to analyse its flow and extract it for posterity.
We learnt about two possible data formats for APIs, XML and JSON, and noted that XML’s qualities make it more suited for documents whilst JSON’s simpler model of key-value pairs and arrays makes it good for small chunks of data. It is JSON that is presented by Twitter API endpoints and we then used Martin Hawksey’s TAGS Google scripting to extract the results of a Twitter search into a Google spreadsheet using our Twitter applications. This provides a one off or ongoing capture of tweets and all the power of spreadsheet analytics for interrogating that Twitter archive including provided summaries and graphs. Hawksey has also built some great visualisation tools that can be used to visualise the Twitter archive in different ways such as TAGS Explorer (you can try this with the demo spreadsheet that is provided by default). This week’s DITA blog on putting this together isn’t due until after reading week so I’m going to wait until I’ve attended the British Library Labs Symposium on Monday and use #bl_labs as my case study.
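To make the XML and JSON comparison a bit more concrete for myself, here is a minimal Python sketch that reads the same made-up record in both formats using only the standard library. The record and its field names (user, text, hashtags) are my own invention for illustration and are not Twitter's actual response format.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical tweet-like record: the field names are illustrative only,
# not the real schema returned by any Twitter endpoint.
json_text = (
    '{"user": "citylis_student", "text": "Archiving tweets", '
    '"hashtags": ["citylis", "bl_labs"]}'
)

# The same record expressed as an XML document.
xml_text = """
<tweet>
  <user>citylis_student</user>
  <text>Archiving tweets</text>
  <hashtags>
    <hashtag>citylis</hashtag>
    <hashtag>bl_labs</hashtag>
  </hashtags>
</tweet>
""".strip()

# JSON maps directly onto dictionaries (key-value pairs) and lists (arrays).
record = json.loads(json_text)
json_tags = record["hashtags"]

# XML is a tree of elements, so we walk it to collect the same values.
root = ET.fromstring(xml_text)
xml_tags = [node.text for node in root.findall("./hashtags/hashtag")]

print(json_tags)
print(xml_tags)
```

Both routes end up with the same list of hashtags. The JSON version arrives as ready-made dictionaries and lists, while the XML version has to be navigated as a tree, which is roughly the document versus small-chunks-of-data distinction from the session.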
This was all pretty cool and also beautiful. Data visualisation is spectacular and artistic. What I haven’t been able to make the leap to yet is what insight it gives. I can understand archiving tweets. The Twitter API only contains tweets from the previous 7 days and then it becomes much harder to access from within Twitter’s vast and commercially valuable data vaults. Capturing tweets provides a handy corpus that researchers can go back and consult but I cannot yet understand what TAGS Explorer is telling me. What does data visualisation add and how do we approach using this corpus for meaningful research rather than just because it’s interesting? We will pick up where we left off after reading week so I look forward to finding out.
The Science of Small Worlds
It’s quite good that I’m behind on the University of Southampton’s Web Science Mooc (#FLwebsci) as this week’s topic of using network theory to analyse social networks really complemented DITA thinking. In this week we looked at network properties and scale free, small world networks … like the web. These are networks where most nodes have very few connections but a few nodes, known as hubs, have huge numbers of connections. This network pattern makes even global networks ‘small’ because most nodes can be connected by a path containing a small number of ‘hops’ between nodes. This is typically 6, leading to the phrase “six degrees of separation”. This video from PBS Nova explains how social networks look and how this pattern is replicated across many natural and human networks.
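To see how hubs keep the number of hops small, here is a toy Python sketch of my own (it is not from the MOOC). It builds a tiny made-up network of two connected hubs with five spokes each, then counts hops with a breadth-first search. The node names and the `hops` function are purely illustrative.

```python
from collections import deque


def hops(graph, start, goal):
    """Return the smallest number of hops from start to goal, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


# Toy network: two hubs linked to each other, each with five spokes.
graph = {"hub_a": {"hub_b"}, "hub_b": {"hub_a"}}
for hub, prefix in (("hub_a", "a"), ("hub_b", "b")):
    for i in range(5):
        leaf = f"{prefix}{i}"
        graph[hub].add(leaf)   # the hub gains another connection
        graph[leaf] = {hub}    # the spoke only knows its hub

# Route between spokes on opposite sides: a0 -> hub_a -> hub_b -> b4
print(hops(graph, "a0", "b4"))
```

Every spoke has a single connection, yet no two nodes in this toy network are more than three hops apart because the route always runs through the hubs. Real networks are far messier, but the same shortcut effect is what makes them 'small'.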
This RSA Animate short on the Power of Networks provides a great visual accompaniment to an article by our tutor Lyn Robinson along with Mike Maguire on using the Tree and Rhizome as metaphors for patterns of information organisation. The tree view of knowledge classification comes from the Aristotelian tradition of branching hierarchies: the rhizome was a term developed by philosophers Deleuze and Guattari to describe an organisation model based on a continually shifting set of connections between things. The tree is like a narrative, the rhizome is a map for a constantly shifting world.
Seeding Knowledge by Ceding Control
We had a preview of some aspects of the British Library’s experimental work that may feature at the British Library Labs Symposium on Monday in Information Management and Policy this week when James Baker from the British Library came to talk to us about his job as a Curator in Digital Research at the library. Digital Research is exploring digital collections beyond resource discovery to research at scale and lowering the barriers to digital researchers. The library’s legal deposit has been extended to UK published websites so the library can now archive born digital resources.
(1) Personal Lives: From Letters and Diaries to Computer Forensics
The implications for archiving with the transition from letters and personal correspondence to Digital Lives. The British Library is interested not just in content as received on computers but performing forensic analysis on hard disks to understand “the life of how someone interacts with the machine”. This raises data protection issues so it is hard to make this collection public.
(2) Infectious Texts
Combining text mining and close reading to map networks of re-printing in 19th-century newspapers and magazines (a kind of historical version of what we are doing in DITA with Twitter data).
This project involved over one million images from within 65,000 books digitised as part of the Microsoft Books project. Initially they were posted on Tumblr, then Twitter, then the whole collection was loaded onto Flickr (with metadata also available on GitHub) under a CC Zero (public domain) licence.
| “We enjoyed losing control of the collection”
James listed some of the remixing and interactions: teaching (learning about curation), hacking, experiments, #immersive adaptions, incorporation into Wikimedia that the experiment has spawned. Using web infrastructure and UX “off the shelf” they were able to experiment with doing things it would be impossible or prohibitively expensive to do with BL systems.
Some Questions/Issues to Negotiate:
- Derived Data: what to do about data built on data, additional metadata and potentially incorrect data
- Remixed Collections: what happens when images are decontextualised
- Reintegration: incorporating user generated data back into BL collections
This made me think more about how the nature of collections and research may both change if digital collections become more open and extensive, connecting with some of our DITA themes.
We are Digital Makers: in a more participatory web architecture and culture we all have the opportunity to curate and create our own ideas and projects from raw digital material provided by libraries into the public domain.
Hacking research: uses of collection data outside ‘serious scholarship’:
- community cataloguing and classification
- machine learning
What is the “role of the curator?”
James is a curator and part of the experiments also involve thinking about how curation might evolve as a result.
| “How do we manage this dispersal?”
It sounded to me like seeding an ecosystem (by ceding control), a different and diverse role for a curator from the more traditional managing a collection. It made me think of Hans Rosling describing public data in his TED Talk The Best Stats You’ve Ever Seen.
But this is what we would like to see, isn’t it? The publicly-funded data is down here. And we would like flowers to grow out on the Net.
James spoke of a spectrum of information control from authority and finality (an institutional mindset?) to adaptability and evolution (a hacker mindset?).
This raises further questions like:
- understanding and tackling the issues that arise when information bridges different spheres
- what is the role of the library along this spectrum?
Thanks to James for coming along and sharing his insight and some of the British Library’s Digital Research ideas and experiments with us. You can take a look at James’ presentation on Slideshare.
In RECS this week we discussed communication both oral and written. This was an interactive, and humorous, session brimming with anecdotes and views on what makes good and bad writing and presenting. When I thought about this as preparation for this session I thought about people like Hans Rosling, Daniel Kahneman, Tony Judt, Roger Deakin, Geert Mak, Hilary Mantel and David Attenborough. I think of being absorbed by their calm authority and their skill in distilling complex subjects into clear, simple prose. They have the quiet confidence that those who don’t see will see. They dive beneath the froth and foaming waves at the surface and guide you into quieter, deeper territory towards something more profound. Like skilful divers they have mastered neutral buoyancy and have the balance, control, technical proficiency, knowledge and experience to achieve this equilibrium. More than individuals and their ability I thought of how good communication makes me feel. It is about transmitting the joy and awe of rising above and standing at the summit of a mountain seeing a vista clearly laid out before you as you have never seen it before.
Yet most of us find these skills difficult and uncomfortable. So this session was designed to help us explore and confront the good, the bad and the ugly. Afterwards I compared the discussion we had on the art of speaking and writing with ease with my constant attempts to improve as a runner and wrote myself some motivational guidelines that might help with both!
In LISF this week Lyn Robinson took us right to the cutting edge and spoke to us about her recent conference paper at Internet Librarian 2014 on immersive documents (see also her blog post), potentially a future development in the history of documents as we shift to an increasingly digital and multimedia world. Both immersion and submersion derive from the same Latin verb meaning to dip, soak or plunge. Immersive unreality refers to virtual worlds that are so real they are perceived as real. Lyn located this type of document emerging from the nexus of pervasive networked computers, multisensory multimedia and participatory interaction. At the moment this is most often tied to gaming or fan fiction but if this kind of transmedia document becomes more prevalent what are the implications for libraries and information centres? If the British Library is navigating the shift from letters to personal computers and book deposit to born digital and researchers are struggling to capture and interrogate social networks what on earth would a library or archive of immersive documents look like?
These are early days. There are no immersive documents yet but there are some great examples from fiction of what they might be and some interesting prototypes emerging e.g. The Craftsman. Immersive documents need new forms of creative writing and new forms of design for transmedia and for hardware, narrative form and content producers to converge (currently developing at different speeds). They also need to go through the technology adoption curve and make the leap from early adoption to mainstream use. Part of me remains suspicious that if you asked the majority to choose between passive and participatory they would choose passive.
This session did make me reminisce wonderfully about the Fighting Fantasy series of novels. Who didn’t read these without bookmarking the previous branch with your finger in case you’ve made a wrong turn? These were individually participatory and gave the reader some agency in determining the outcome through the branches. I guess we are back to the tree and the rhizome again: digital immersive documents probably offer much more in making this less a branching narrative and more an evolving narrative and also more real than leaving your fingers in three different places to check that your decision hasn’t made you dead yet so you can go back and explore an alternative story if you’ve been stupid.
There are going to be ethical and cultural issues if this form takes off:
- what are the privacy implications? Bad enough surveillance of activity and communication but now add performance, fantasy and dreams
- are stories defined by the medium or do stories drive the medium?
- could you experience someone else’s experience or would context always get in the way?
Some of the issues for LIS may include:
- are immersive experiences documents?
- indexing and versioning
- retrieval systems
- information interaction behaviour
- immersive literacy
Rest assured though. If it does come and you’ve studied at CityLIS you are going to be prepared!
- Data as culture: how will we live in a data driven society? from the Royal Statistical Society, who are also hosting a talk on 19th November by David McCandless on his book Knowledge is Beautiful
Not much Flânerie this week as I was busy setting up my new computer. Next week is Reading week so apart from heading to the British Library on Monday I’ll mostly be spending my week with my nose in a book, (or its digital equivalent), and thinking about upcoming assignments.
Featured image: Heading up through the bubbles by Saspotato. Source: Flickr. (CC BY-NC-SA 2.0)
From Running to Communicating
In our Research, Evaluation and Communications Skills class this week the emphasis was very much on communicating. We were asked to think about good and bad writing and presentation styles and think about our favourite writers and presenters. We were also asked to think about what we liked and disliked about writing and presenting. This was an interactive session with lots of great ideas and input. By the end of it I was thinking about how I could put together a motivational guide for myself based on the good advice discussed in class and by reflecting on my own previous practice, not just in writing and presenting, but also drawing on my training programmes from my time as a hockey player and now a runner and project work.
Beat prevarication by aiming low.
At the start you think it will be hard knowing what to put in: by the end you realise it is harder deciding what to leave out. Still that blank canvas is daunting.
One thing I’ve learnt from my running is it’s sometimes best not to think about the end, just think about the next step. The prospect of training enough to finish a race can be so nerve wracking it becomes dispiriting. Instead in my running I initially try and concentrate on why I run not how far or how fast I am running. I think about beautiful trails and fresh air; clearing my head and feeling energised and healthy. I tell myself to just get out and do a little bit every day. If I felt like stopping after 500m I could but at least I would have started. Once out I nearly always run further than I think I will but the key to running consistently for me is not to put pressure on myself by thinking a run isn’t worthwhile if it isn’t what I planned. Anything will help. When I join several runs together and train consistently I get fitter without even noticing and enjoy the process much more than when I focus on targets. Like running, writing and presenting are not just skills they are habits, and forming good habits is hard. The hardest part, however, is the first step. Once you’ve got going you have momentum so to get going I tell myself to sit down for each study session and write not much of anything to begin with and go from there. I always write more than I think I will.
Find an authentic voice by using a style that suits you and your audience.
Adapt your persona to suit your audience but be sure that persona is still true to you. There is no best way if doing it that way makes you or your audience feel uncomfortable. There is no bad way unless it distracts from what you are saying. As long as you are enthusiastic about what you are trying to say your audience will likely be engaged.
The same is true for running. Go running and you will see hundreds of people and hundreds of different styles. Some look incredibly uncomfortable, others look as though they are flying over the ground. You also can’t sprint a marathon or jog a sprint. There has recently been a trend towards minimalism and more ‘natural’ running styles in the running literature. This has come from the idea that there is a best way to run and it’s based on the way our ancient ancestors ran. It has become the new evangelism in running. It has led to a wealth of self-help guides encouraging people to adapt their running gait from heel strike to midfoot strike without there being much evidence that one is universally better than the other. It has also seen running shoe fashion move towards shoes with less cushioning and a lower to the ground structure. For many it might have brought them more strength, fewer injuries and better running. For others it has brought the opposite either because such a style doesn’t suit them or their running or they have attempted to transition too fast. You cannot go from one style to another in a single training cycle but it doesn’t stop people trying. So the literature is filled with more research and opinions for and against with the end conclusion usually being the best style is the one that works for you, by keeping you injury free and healthy, rather than prevailing fashion. Stay strong, be flexible, wear simple and comfortable shoes that don’t use too many gimmicks.
Build confidence by practicing regularly.
One of the reasons I am trying to blog more at the moment is because I know at the end of this year I will have to write a dissertation. A dissertation is maybe like a half marathon of writing so can’t be entered into lightly. It needs preparation and practice to even finish never mind do well. Preparation for a race will likely include following some kind of training plan that will aim to build fitness gradually over time using periodisation. This involves varying your training over long and short cycles and organising it into phases that start with shorter easier tasks and culminate in more race specific tasks before tapering towards the end so you will feel fit and fresh.
A typical training plan will include the following phases:
- Base (develop basic stamina and endurance)
- Build Up (increase strength and endurance)
- Peak (mix longer and faster sessions to develop all round intensive and extensive endurance)
- Taper (ease down on sessions so your body stays in good condition but is able to reap the benefits of training by having more recovery)
- Race (enjoy the results of all that preparation!)
Writing and presenting are kind of the same. They involve a period of researching and playing with ideas and notes. Maybe organising those thoughts into more of a structure and fleshing them out into a first or second draft. You’ll reach the point where you’ll need to start taking things out rather than going longer and spend some time away from the project before going back and reading it through and polishing it. Finally you will publish or present it.
This all becomes a lot easier if you do it regularly. Runners have maintenance phases so writing and speaking in front of an audience whenever possible will help find or maintain your communication rhythm and style between formal projects. This is one of the reasons I’ve started to blog more and write about the weeks: I’m hoping my communication ‘fitness’ will improve and things will be slightly easier for being familiar when more formal assignments come along. I also know from running that the sessions I find the hardest and most dread are often the ones that leave me feeling most exhilarated and motivated to continue afterwards. So just keep trying.
Let it Go.
Beat perfectionism by being agile
Simplicity – the art of maximizing the amount of work not done is essential.
One of my biggest problems is knowing what to leave out and when to stop. I don’t like to let ideas go so I squeeze them in until they have no room to breathe. It is difficult leaving painstakingly excavated research and carefully crafted words by the wayside so I tinker … but less is often more.
I got better at changing my mindset so I could tame my inner perfectionist and accept good enough more often once I had worked on some agile software projects and learnt about iterations and definitions of done and thought about my work in terms of releases and continuous improvement and quality in terms of fitness for purpose. To author agile I: allocate effort, rapidly revise work, spend more time taking things out than putting things in, leave incomplete features on the back burner for future releases then stop. It may not be perfect but it will be fit for purpose. Most people most of the time won’t notice the difference between good enough and great and will forget about the bad but you’ll notice, and get exhausted by, how much more effort you have put in to achieve the finality you crave. So just ship it. Once it is done it cannot be redone or undone so by all means reflect but move on. Next time is waiting.
Entropy, APIs and the Public Record vs the Right to Privacy
CityLIS Term 1 Week 4. In which we move on from the history of documents to the relationship between information, the universe and everything; we play with the shift from a static, publishing web model (Web 1.0) to a service oriented, participatory web model (Web 2.0) by exploring web APIs and mashups; #citylis went to Internet Librarian 2014 (#ili2014), European Conference on Information Literacy 2014 (#ecil2014) and supported Open Access Week (#oaweek); we explored the tensions between freedoms of speech and information and data protections and the right to be forgotten; and we thought about ‘asking’ as research method.
Let’s Get Meta-Philo-Physical
After completing the history of documents Lyn Robinson turned to philosophy and as many sciences as she could throw at us in one afternoon to explore definitions of information, and the gaps between these definitions, across multiple domains. We covered Liebenau and Backhouse and their semiotic theory of levels in understanding information, Popper’s three worlds, Shannon’s 1948 Mathematical Theory of Communication, Professor Brian Cox on entropy and Sir Paul Nurse on Biology as organised systems of information. Not forgetting Luciano Floridi and his philosophy of information. The book chapter David Bawden and Lyn Robinson wrote on conceptualisation of information across domains is well worth a read.
“We are faced with two kinds of gaps: the gaps between the concepts of information in different domains; and the gap between those who believe that it is worth trying to bridge such gaps and those who believe that such attempts are, for the most part at least, doomed to fail.”
Robinson and Bawden (2013). Mind the Gap: Transitions Between Concepts of Information in Varied Domains
After being fairly comfortable with history this was fairly mindblowing – in a good way. We discussed information as difference (which I had to write down in three different ways to get my head around) and also information, entropy and the constant interplay of order and disorder. Is there more information in low order/high entropy systems, as Shannon argues, or is there more information in high order/low entropy systems?
It is counterintuitive to think that as the disorder and uncertainty around the arrangement of documents increases the amount of information increases. In LIS, we instinctively think that as order increases so does information. This may not be true. Findability may increase but this may not be the same as information.
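To make Shannon's side of the question concrete, here is a minimal sketch in Python. The two "shelves" are hypothetical, with each letter standing for the subject of one document, and the function measures only Shannon's statistical uncertainty, not meaning or findability.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Return the Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    # Sum p * log2(1/p) over the observed symbols.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical shelves (illustrative data, not from the lecture).
ordered_shelf = "AAAAAAAA"  # every document is on the same subject
mixed_shelf = "AABBCCDD"    # four subjects, each equally likely

ordered_bits = shannon_entropy(ordered_shelf)
mixed_bits = shannon_entropy(mixed_shelf)
print(ordered_bits, mixed_bits)
```

On the perfectly ordered shelf the next document is never a surprise, so in Shannon's sense it carries no information; on the mixed shelf each document resolves some uncertainty. That is the counterintuitive part: the tidy shelf is easier to use but scores lower.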
Perhaps one of the compelling things about big data is the insight that comes from mining data that is more disordered than in a traditional database. Therefore, there is more to be uncovered about the possible arrangements of things within: hence being able to find more information using NoSQL techniques across a large unstructured corpus than using SQL techniques across a database ordered according to a particular scheme. Alternatively there is no information in big data until order has been found using complex algorithms and approaches (e.g. MapReduce).
Blogging Mashup Mixtape Party
In the digital world I reached back towards my love of mixtapes to explore the present Web 2.0 possibilities for mashups by using open content, licensed for reuse, and web services. This was huge fun and involved creating Spotify playlists (including my mashup mixtape and cityLIS radio), Twitter widgets, watching TED Talks, turning my websites into pictures based on human DNA, playing with WordPress shortcodes and sticking all of them together. Also discovering someone has hacked together a cassette player and tapes as a controller for Spotify playlists using Raspberry Pi. Very cool.
In fact, there were numerous music related API and mashup posts across the DITA blogosphere.
To Know and Forget
This week’s information management and policy session was on Information Law and there was a really interesting discussion about the issues arising from the European Court of Justice ruling (ECJ C–131/12) in the case of Google Spain SL and Google Inc. v Agencia Española de Protección de Datos (AEPD) and Mario Costeja González. This ruling allows individuals in Europe to request that Google remove links from search results to content about them published on the web as part of the European Data Protection Directive (95/46/EC).
“The internet has revolutionised our lives by removing technical and institutional barriers to dissemination and reception of information, and has created a platform for various information society services. These benefit consumers, undertakings and society at large. This has given rise to unprecedented circumstances in which a balance has to be struck between various fundamental rights, such as freedom of expression, freedom of information and freedom to conduct a business, on one hand, and protection of personal data and the privacy of individuals, on the other.”
European Court of Justice Opinion ECLI:EU:C:2013:424
Our discussions ranged over the practical issues, the various roles of publishers and of information indexers and mediators such as search engines, and the ethics. The debate in the public sphere is also ongoing as the many parties involved attempt to implement and digest the ruling.
The European Union has produced a Mythbuster and a Factsheet to help with interpretation.
Google publishes a transparency report on their implementation of the ruling and has also assembled an advisory council to guide it. The council holds a series of public meetings across Europe and invites contributions from members of the public.
Luciano Floridi, a member of the Google advisory council, popped up again with an article in The Guardian considering the right to be forgotten as an exercising of power over information that needs to be carefully considered.
Floridi argued that publishers should have more of a say, a sentiment echoed by the BBC and The Guardian, with the BBC saying they will begin to maintain and publish a list of their content for which they have received removal notifications.
- Great visualisations of Information Geographies (Oxford Internet Institute)
- A zoo full of data visualisation techniques (Jeffrey Heer, Michael Bostock, and Vadim Ogievetsky)
- Connecting Otlet and Bush in The Secret History of Hypertext (The Atlantic)
Featured Image: time disappears by Travis Miller. Source: Flickr. (CC BY 2.0)
Needs to Knowledge Past and Future
CityLIS Term 1 Week 3. In which we completed the story of documents from the dawn of time to the present day and discovered everything connects; I found out how catalogue cards connect with the pre-history of the web; the Economist wrote about the Future of the Book and played with its form; we learnt about asking questions and finding answers using databases and information retrieval and knowledge management.
Inspired by Library and Information Science Foundations (LISF) and the story of documents Part 3, this catalogue card shows us the use of classification schemes within a cataloguing code using a 20th century format, the index card. It also provides some additional user created metadata added to the official typed record. An added identifier is “the Lemur Book” referring to the animals that usually distinguish the cover of an O’Reilly book. We also see something written on it that links into the information retrieval themes covered in Digital Information Technologies and Architecture (DITA) and the contextual siting of search around a seeker and their information context and needs: “What we find changes who we become”. This image itself was found by practising information retrieval techniques from the DITA lab session.
Yes, in this week’s LISF lecture we completed our history of the story of documents, taking us from the Enlightenment to the present day in the ongoing quest for bibliographic control over the world’s knowledge. This featured much coverage of the 19th century and Victorian pioneers who laid down such robust foundations for modern library and information science they are still the cornerstones of the discipline to this day. This includes intellectual tools such as catalogues, classification schemes and memory institutions such as the British Library and the public library network.
The Ancestry of the Web
These themes were reinforced in Week one of the FutureLearn MOOC Web Science: How the Web is Changing the World from the University of Southampton. I watched a lecture (activity 1.10) by Professor Les Carr on the pre-history of the web. This discussed now familiar territory including Paul Otlet’s Mundaneum and Vannevar Bush’s Memex. He spoke of the importance of the Mundaneum not just as another attempt to collate the world’s knowledge but also for its new intellectual tools (librarians, queries) and technologies (the index card).
“Query became part of the bibliographic record. Content was interlinked.” – Professor Les Carr
He also spoke about the 1937 idea by H.G. Wells to use microfilm to capture all the world’s knowledge as The World Brain, a permanent encyclopaedia.
“There is no practical obstacle whatever now to the creation of an efficient index to all human knowledge, ideas and achievement” – H.G. Wells
We then passed through the emergence of the internet, a network of networks, inspired by the work of computer scientists such as Vint Cerf, towards the emergence of the web. Despite this lineage from attempts at bibliographic control and capturing all knowledge, this wasn’t really the impetus for the web. The web was intended to solve information management problems at the CERN research lab in Geneva.
The web’s architecture contained three core ideas that realised and embedded interlinking and querying in the digital record:
- URIs/URLs – the idea that everything has a unique identifier
- HTTP – a mechanism for allowing clients and servers to communicate via the internet
- HTML – the ability to encode document structure and links to related documents in a simple markup language
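As a rough illustration of how these three ideas fit together, here is a minimal Python sketch using only the standard library. The address `http://example.org/notes/week3.html`, the HTML fragment and the `LinkCollector` class are all made up for the example; nothing is actually sent over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the targets of <a href="..."> links as absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


# URI/URL: a unique identifier for one (hypothetical) document.
page_url = "http://example.org/notes/week3.html"
parts = urlsplit(page_url)

# HTTP: the plain-text request a client would send to ask for that document.
request_text = f"GET {parts.path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"

# HTML: markup that encodes structure and links to related documents.
html_text = (
    '<p>See <a href="week4.html">next week</a> and '
    '<a href="http://example.org/about">about</a>.</p>'
)
collector = LinkCollector(page_url)
collector.feed(html_text)
print(request_text)
print(collector.links)
```

The identifier says which document, the request asks for it, and the markup that comes back points on to further identifiers, which is the interlinking and querying described above.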
From Geneva it expanded throughout the scientific research community and was then given to the world. As Tim Berners-Lee famously said: “This is for Everyone” and everyone took it and used it for new and different purposes extending the web into the information service we have today.
If you are not taking #FLwebsci yet register quickly and catch up before it closes. It’s a well put together course with great discussions going on as participants share their thoughts and experience.
The Future of the Book
Lyn’s whole epic narrative arc of documents from the ancient world through to the world wide web was also supplemented this week by an essay published in the Economist on the Future of the Book called From Papyrus to Pixels. The article itself is a fascinating read connecting books past, present and future and discussing the connections between formats, technologies, authors, readers and publishing business models to trace things that endure, things that may change and things that may fade and revive. For all that has changed the essence of the book as a route to pleasure and for encouraging connections between people and knowledge persists across millennia.
“Books will evolve online and off, and the definition of what counts as one will expand; the sense of the book as a fundamental channel of culture, flowing from past to future, will endure.” – The Economist. Future of the Book Essay. From Papyrus to Pixels.
Interestingly the essay is also provided in three formats: an audio version, an ink stained, coffee ringed skeuomorphic virtual book and a web page. It was noticeable when I first encountered this information presentation that my first thought was to call it a ‘traditional’ web page. I clearly thought using the web to deliver audio or digital reconstructions of a retro physical paper format to be more cutting edge. The web succeeds most when it takes what was best about old formats and technologies (codices, radio) and brings them forward to the web, creating richer, ever more intricate and converged documents. I still find turning pages (even fake ones) more immersive and a two page layout in soothing black and white more engaging than scrolling through a long single column of text with brightly coloured images, headings and marginalia. How technically and conceptually clever of them to prompt such debate even before a word has been read.
Finding and Knowing
Over in our cityLIS digital world we covered databases, information retrieval and the precision of search engines. I had never paid such close attention to the practice of searching before. Perhaps I have become a lazy searcher carelessly tossing free text searches into the most obvious search box and uncritically accepting what comes. Thanks to this week’s lab I paid close attention to different types of information need, to different search methods for information retrieval, the precision and recall of different search engines and came up with some varying conclusions. This also came up in our research methods class where we were introduced to Cyril Cleverdon who was the first person to suggest formal testing of information retrieval systems and developed the measures of precision and recall as part of his investigation into the comparative efficiency of indexing systems.
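For anyone who wants the two measures spelled out, here is a minimal sketch in Python using the standard definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. The document identifiers are invented for illustration and are not results from the lab.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually found
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: four results came back, three documents were relevant.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc2", "doc4", "doc5"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here half of what came back was useful, and two of the three useful documents were found. A search engine can usually push one measure up at the expense of the other, which is why the two are reported together.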
Cleverdon is an entity in Google’s Knowledge Graph and bridging the gap between information needs and knowledge was another theme of the week. This connected into our Information Management and Policy lecture on Knowledge Management that was given by guest Lecturer Noeleen Schenk from Metataxis. In this session we covered some of the models, benefits, drivers, tools and challenges involved in managing knowledge within organisations.