Colin Brooks ↰

Experimenting with what works

Drawing of anthropomorphic A and B holding hands while a W watches on at a distance.

A lot has changed about whitney.org over the last year. This includes the entire platform underpinning the site, and a number of major usability and user interface improvements, from reworked navigation, to new mobile experiences for audio and video, to the incorporation of outside voices in our exhibition content. And with growing distance from the complexities of launching a new website, our data-related work has been picking up steam as we’ve been able to devote more time and mindshare to it, which in turn has begun to more deeply impact our design thinking and decision-making processes. A major aspect of that impact is an increasing ability to reevaluate our assumptions, and to better understand how visitors are actually interacting with us…rather than just how we might think they are.

In the beginning…

With whitney.org being rebuilt from the ground up, there was the opportunity (and requirement) to reconsider what data we ought to be collecting and analyzing through Google Analytics (GA) and similar tools. I joined the Whitney Digital Media team part way through this project, and so initially my primary goal was simply to get a handle on what was being tracked, and where we had major gaps.

Previously I had interned with the Museum of Modern Art on a data analytics and ingestion project and I hoped to build on that effort at the Whitney. Google Analytics can give you a lot just from incorporating the tag on your website, but to really understand our audience we needed to make sure we were also tracking important interactions, like use of our navigation, audio and video plays, and various call-to-action (CTA) clicks that wouldn’t necessarily be visible in the default pool of data. Without that tracking, we wouldn’t have a baseline to compare to when we were ready to start considering more major design and structural changes. Looking forward meant getting our house in order, today.
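
As a flavor of what that interaction tracking can look like in practice, here is a minimal sketch using the analytics.js event syntax; the selectors, categories, and labels are hypothetical stand-ins, not our actual tracking plan.

// Minimal sketch of custom event tracking with analytics.js.
// Selectors and category/action/label values here are hypothetical.

// Record an event when a visitor starts an audio stop.
document.querySelectorAll('.audio-player .play').forEach(function (button) {
  button.addEventListener('click', function () {
    ga('send', 'event', 'Audio', 'play', button.getAttribute('data-title'));
  });
});

// Record an event when a visitor clicks a ticketing call to action.
document.querySelectorAll('a.cta-tickets').forEach(function (link) {
  link.addEventListener('click', function () {
    ga('send', 'event', 'CTA', 'click', 'Buy tickets');
  });
});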

In addition to putting together the pieces for an interaction baseline in Google Analytics, I was also interested in reviewing how accurate our GA-reported ecommerce numbers were compared to the organization’s internal accounting. This necessitated reaching out to colleagues in other departments. And while those conversations were brief, they have been incredibly important in framing our work. Seeing how our numbers compare to their numbers gives context to the fuzzy reporting of Google Analytics, and helps us to know what level of confidence to ascribe to our figures…which is vital to separating the signal from the noise.

From passive data to A/B testing

The next stage in the Whitney’s digital data lifecycle has been to start investigating why our numbers are what they are. It’s one thing to know that we got 100 clicks on a button last week; it’s another to tease apart why it was 100 and not 50, or 200, or 10,000. To that end, we’ve begun to evaluate more and more of our core digital content through A/B testing. Altering certain elements of pages and considering those changes in relation to basic metrics like session duration, pageviews, and ecommerce transactions has been hugely helpful in figuring out which aspects of our design and layout are effective at achieving their intended purpose. It has also been a gradual process, characterized by a growing willingness to experiment.

We began by making small changes to individual elements using Google Optimize: experimenting with the color of a few CTA elements, shifting the placement of a few links (and adding others), all in the hope of driving more conversions. As we became more familiar with the process of building and running tests, we started experimenting with larger aspects of the site, including a wholesale replacement of the footer, and a number of tests designed to start teasing out the effects of using different kinds of content to promote exhibitions (or, more broadly, the institution) on our homepage. In the case of the latter, we’ve variously shown users videos of artworks, images of artworks, experiential videos of exhibition installations, and images of people in those exhibitions.
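
Most of that work happens in the Google Optimize interface rather than in code, but for experiments that touch client-rendered parts of a page, Optimize’s activation-event pattern lets us hold an experiment back until the content under test exists. A rough sketch, where renderHomepageHero is a hypothetical stand-in for whatever renders the element being tested:

// Sketch of Google Optimize's custom activation-event pattern. The
// experiment itself is configured in the Optimize interface and set to
// activate on the 'optimize.activate' event; renderHomepageHero is a
// hypothetical stand-in for whatever builds the element under test.
window.dataLayer = window.dataLayer || [];

function activateOptimizeExperiments() {
  window.dataLayer.push({ event: 'optimize.activate' });
}

// Only evaluate the experiment once the client-rendered content exists.
renderHomepageHero().then(activateOptimizeExperiments);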

The question of what content works is incredibly important to us as the museum’s Digital Media department. It’s our job to determine how to best reach new and returning audiences, and a major driver of our ability to do that is the media we choose to forefront. A/B testing is one more tool we can use to help make those decisions.

The conventional web wisdom tends to be that video will result in higher user engagement than still images. Video is also inherently more expensive than images, so the choice to go with one over the other doesn’t come without a few tradeoffs. In the effort to set a baseline for both our data and any data-driven strategy, we wanted to test the conventional wisdom against our specific circumstances as a mission-driven arts institution. Given the added cost, we wanted to be sure that we understood the implications of each approach, and the metrics they affect.

And while it’s still too early to say what the definitive value difference may or may not be for us, the initial results have been interesting; they warrant further investigation and suggest new avenues to explore for homepage content.

Incorporating user feedback

Closely connected with our push into A/B testing, we’ve been working to get more qualitative feedback on various aspects of our platforms. This has included both quick in-person terminology surveys and an incredibly rigorous two-part, in-depth usability study of whitney.org run by an outside consulting firm. Both kinds of testing have been extremely useful, while also demonstrating why it can be worth it ($$$) to work with experts in the field. In-house user testing has made a lot of sense for small, focused efforts, while the external usability study brought a level of expertise and specialized tooling we would never otherwise have had access to. There’s much more to say on both, but that deserves its own post.

It’s worth mentioning that all of our user testing efforts at some level remind us of what we aren’t doing as well as we’d like. And while it would be easy to be discouraged by that, I think it’s important not to be. There is no end state in usability work. There is no point at which everyone will convert, everyone will understand our UI, and everyone will have a positive experience the first time, and every time after. Rather, our goal is the constant evolution of our products, reflecting the changing needs and expectations of our audience.

T H E F U T U R E, and moving to product development in tandem with evaluation

So what comes next for our data efforts? Likely, it’s wrapping together all the helpful experiences we’ve had over the last year into a more consistent, repeatable process. In effect, moving to something akin to user-test driven development, where new features are planned alongside user-focused evaluative measures. Whether that means A/B testing aspects of design or functionality, or small-scale in-house user surveys, or even larger external evaluation, our actions should depend on the nature of the feature and be determined in advance, rather than only as a post-launch reactive measure.

TL;DR

  1. Just because we measured something doesn’t mean we measured what we thought we did.
  2. User testing is useful in all kinds of forms.
  3. User-test driven development is good development.

[Read this on Medium]

How Many Hoppers?

An up-to-date count of the number of works by Edward Hopper currently on display at the Whitney Museum of American Art.

“Hey could you give me the numbers on that again?”

Interning with MoMA’s Digital Media department is great. Building a dashboard to track metrics across the museum is really complicated. Both of these statements are true, which is the kind of authoritative certainty I strive for in data analytics. This summer my job has been to create a dynamic dashboard that pulls in data from sources all over MoMA and, ideally, updates automatically.

On its face, building a dashboard might sound like a straightforward task, composed of a) collecting the data, and b) visualizing it. But in practice it’s been much more complicated, requiring a significant amount of both human and technical collaboration. Figuring out where various metrics are coming from, if they are what they should be, and how to wrangle them into a dashboard requires answering a lot of questions with often imperfect solutions.

Part 1: Data audit — “What’s out there?”

Being new, and legitimately having no idea how MoMA kept track of anything, I had no reasonable way to start this project without first talking to the people who knew their data best. MoMA had already identified the types of metrics it wanted to track, but the nitty gritty of how to generate them accurately and efficiently was something I needed to sort out. I began by sitting down with individuals from different departments and asking what metrics they were interested in, what services they used, what their workflow looked like, what things worked well, and what things they’d like to do differently.

Having these conversations with people from retail, social media, email marketing, membership, management information, IT, collections technology, and others, was vital to the project for two primary reasons:

  1. There would be no technical way to create the metrics I needed without knowing where they had to come from.
  2. Without these conversations I wouldn’t be able to even start to answer the sorts of questions that have come up constantly throughout this project, and I’d be sending a dozen emails a day asking how numbers were calculated.

What those early conversations allowed me to do was be more efficient when I did have to bother people. They also meant that I had a better idea of what was possible to track, and, critically, at what point in the data flow I should try to track from.

By the way, when you think you’re done with this, you’re not: something else will always come up.

Part 2: Get the data — or SaaS (Spreadsheets as a Solution)

Of course the obvious question for all of this is how do you get data into the dashboard in the first place? How do you marshal it all together in a consistent and interoperable enough form where that’s possible? It’s a lot easier said than done when you start reaching dozens of separate services, and feel your brain start to melt thinking about how to combine your InstagramTumblrTwitterFacebookYouTubePeriscope, and email subscriber engagements segmented by month, post, impression, or number of followers.

“Data” is great, but the practice of getting it cleanly, consistently, and quickly for multiple people can be a real headache when it’s spread so widely. This is part of the impetus for MoMA to build a unified data-dashboard. To create a single place where KPIs and other important metrics can be observed by multiple people, over time. Where there is enough information to be useful at a high level even if it isn’t necessarily the absolute truth, and the reasons for any inaccuracies are clearly documented.

It’s important to point out that this has been by far the longest step in the process. The state of data on the web in 2016 is one of APIs and a gauntlet of new vendor-specific dashboards you need to log in to once a month, remind yourself where they hid the export button and date selection, and dump the data to a spreadsheet. For a place like MoMA this presents a kind of sweet spot of organizational difficulty: enough data and the savvy to utilize it within each channel, but enough distinct data sources that it becomes a monumental task to assemble everything into a bigger aggregated picture of what’s going on.

Grid of 12 screenshots of various dashboards and data sources.
A small smattering of MoMA data sources. There are dozens more.

It’s great that (in a vacuum) so many of the museum’s data sources provide access to their content or analytics online. But as the number of services MoMA and other similar institutions use continues to creep upwards, these sorts of siloed, check-each-service solutions start becoming really unwieldy.

I think it can be tempting, having reached this point, to do one of the following:

  1. Buy a service that sells itself on integrating all your other services. This will inevitably pose its own complications, and will require yet another service in the future to better integrate the integrating services.
  2. Develop something in-house that will require highly technical (and expensive) maintenance for the duration of its existence.
  3. Say screw it to the whole thing and stick with your current inefficient workflow.

What I’ve tried to do with MoMA is split the difference between those choices, hopefully taking some of the benefits of each.

In my mind the core problem in sharing data across any like-minded medium-sized institution is making the data available at a level that is accessible from a range of technical perspectives. And I mean that both with respect to the technologies themselves and the expertise of the people utilizing them. It might be great to break your social media data out in a SQL database, or to say hey, here’s the API keys, have at it. But that doesn’t work for everyone’s level of technical know-how, even if it might work with your Tableau or Qlik integration. Building something that will last requires building something that can evolve alongside needs and expectations. So rather than focus on creating the perfect dashboard, I’ve worked to get MoMA’s data feeding into spreadsheets.

That may sound a little antiquated, but the reality is that spreadsheets are already part of the workflow of just about everyone who deals with data. Getting data into Google Sheets means that nearly anyone can do something with it, from charting within the sheet, to pivot tabling it somewhere else, to importing it into just about any other data-dealing software imaginable (including, in MoMA’s case, Tableau). However, what makes Google Sheets in particular an appropriate tool for building consistency across sources is Google Apps Script.

Google Apps Script allows you, with minimal fuss, to write JavaScript that interacts with a corresponding Google Sheet. In turn that allows for the possibility of connecting to the APIs of the various data sources you’re trying to do something with. For MoMA that means things like our ticketing system, email service, social media channels, and our app download tracking service, among others, can all be set up so that each night a Google Apps Script goes out to each API, requests data from the previous day, and then inserts it back into a Google Sheet. And with Google Apps Script triggers, that happens entirely automatically, with no human input. From there the data can be brought into visualization software like Tableau or Google Data Studio, or simply shared with someone else to do whatever they want with it.
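
As a rough sketch of what one of those nightly jobs can look like (the endpoint, sheet ID, and response fields below are placeholders, not an actual MoMA data source):

// Google Apps Script sketch: pull yesterday's numbers from a hypothetical
// REST endpoint and append them to a Google Sheet. The URL, sheet ID, and
// response fields are placeholders, not an actual MoMA data source.
function pullYesterdaysMetrics() {
  var yesterday = new Date(Date.now() - 24 * 60 * 60 * 1000);
  var dateString = Utilities.formatDate(yesterday, 'America/New_York', 'yyyy-MM-dd');

  var response = UrlFetchApp.fetch(
    'https://api.example.com/v1/metrics?date=' + dateString,
    { headers: { Authorization: 'Bearer ' + PropertiesService.getScriptProperties().getProperty('API_TOKEN') } }
  );
  var data = JSON.parse(response.getContentText());

  var sheet = SpreadsheetApp.openById('YOUR_SHEET_ID').getSheetByName('Daily');
  sheet.appendRow([dateString, data.visits, data.transactions]);
}

// Run once to schedule the fetch to fire automatically every night.
function createNightlyTrigger() {
  ScriptApp.newTrigger('pullYesterdaysMetrics')
    .timeBased()
    .everyDays(1)
    .atHour(2)
    .create();
}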

Admittedly, “here’s the solution, just write some JavaScript and couple it all to a dashboard in Tableau” is itself fairly technical, but on the scale of what it could be I think this is reasonable. Google provides sample code for a variety of services that can, without too much trouble, be massaged into connections to others. And on a personal level, I think the reality is that as more of the systems we interact with shift to RESTful APIs, this kind of institutional knowledge will have to grow.

What this sort of middle ground of technical automation and integration allows for is an easy source of consistent data. With proper documentation, this opens up metrics to more people without having to deal with access to the specific services themselves, and allows them to answer more of their own data questions. The technical cost is in setting up the API connections in Google Apps Script, but once that’s done there should be much less upkeep necessary than for a more end-to-end, from-source-to-viz solution.

TL;DR

The vast majority of the work in building a dashboard for MoMA’s digital properties has been in figuring out why the numbers are what they are, and how to get them into a consistent and useable format. Updating them automatically on a schedule has been icing on the cake. This is a project that has taken me across the museum and a good chunk of the internet. And it’s become abundantly clear that there is no substitute for personal conversations.

I’ll end with a few top tips for anyone undertaking a similar journey through data:

  1. Talk to people.
  2. Don’t assume you’ll remember what someone patiently explained to you; please write it down.
  3. Accept imperfection. Lots of it.

And if I could give those tips a second time, I would do that too.

[Read this on Medium]

Neuron

Mixed media enlargement of an imagined neuron, incorporating generative audio and video, projection mapping with Kinect and OpenCV, chicken wire, stainless steel, and acrylic.

Recursion lesson & survey

An experiment in teaching the concept of recursion to new audiences using an animated Sierpinski triangle. Learners read a short sequence of paragraphs describing recursion while the animation played alongside, then answered a brief survey on the topic.

Spheres II

Continued exploration of deconstructing spheres, limited color, galaxy-inspired movements and shapes, and generative growth.

Whitney Teens Open Studio sign in

A smarter sign in form for Whitney Museum teen events. Parses acronyms and commonly mis-entered information, aggregates data, and pushes it to a central Google Sheet.

Deconstructing Shells II

Early exploration of deconstructing spheres, galaxy-inspired movements and shapes, and generative growth.