Categories
Technology Thoughts

Privacy Nutrition Labels for the Top Apps of 2020

With the release of iOS and iPadOS 14.3, all app updates in the App Store are now required to include Privacy Details, or “nutrition labels”.

App Privacy Labels

At a high level, there are three categories of nutrition label:

  • Data Used to Track You
    • “May be used to track you across apps and websites owned by other companies”
  • Data Linked to You
    • “May be collected and linked to your identity”
  • Data Not Linked to You
    • “May be collected but it is not linked to your identity”

Within each category, there is additional info split into types of data collected and ways data is used.

Types of data an app can collect include:

  • contact info
  • health & fitness
  • financial info
  • location
  • sensitive info
  • contacts
  • user content
  • browsing history
  • search history
  • identifiers
  • purchases
  • usage data
  • diagnostics
  • other data

Ways data is used include:

  • third-party advertising
  • developer’s advertising or marketing
  • analytics
  • product personalization
  • app functionality
  • other purposes

App Privacy (See Details)
The developer, Zoom, indicated that the app's privacy practices may include handling of data as described below. For more information, see the developer's privacy policy.
Data Linked to You
The following data may be collected and linked to your identity: Location, Contact Info, User Content, Identifiers, Usage Data, Diagnostics
Privacy practices may vary, for example, based on the features you use or your age. Learn More
Zoom Privacy Details – apps.apple.com

Putting it all together, when looking at an app in the store, like Zoom for example, you can see the app collects your location, contact info, user content, identifiers, usage data, and diagnostics and links the data to you. If this data were in the “not linked to you” category, it would still be collected, but anonymously.
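One way to picture a label is as a mapping from each of the three categories to the set of data types reported under it. The Python sketch below is only an illustration: the `PrivacyLabel` class and its field names are made up for this post and are not an Apple API. The Zoom values are the ones shown in the listing above.

```python
from dataclasses import dataclass, field


# Hypothetical model of a privacy nutrition label (not an Apple API).
@dataclass
class PrivacyLabel:
    app: str
    used_to_track_you: set = field(default_factory=set)
    linked_to_you: set = field(default_factory=set)
    not_linked_to_you: set = field(default_factory=set)

    def all_types(self):
        """Every data type reported in any of the three categories."""
        return self.used_to_track_you | self.linked_to_you | self.not_linked_to_you

    def collects_anonymously_only(self):
        """True when data is collected, but none is linked to you or used to track you."""
        return bool(self.not_linked_to_you) and not (
            self.used_to_track_you or self.linked_to_you
        )


# Data types taken from the top-level Zoom listing shown above.
zoom = PrivacyLabel(
    app="Zoom",
    linked_to_you={
        "location", "contact info", "user content",
        "identifiers", "usage data", "diagnostics",
    },
)

print(sorted(zoom.all_types()))
print(zoom.collects_anonymously_only())
```

Keeping the three categories as separate sets makes it easy to ask questions like "which apps only collect data not linked to you?" later on.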

The top level information tells you what data the app collects, but to see how the data is used, you need to select the “See Details” link at the top right of the App Privacy section.

From the expanded view, you can see that Zoom collects data for advertising & marketing, analytics, and general app functionality. This may look like a lot, but Zoom’s data use is comparatively short. Details for Facebook’s data use scroll for days.

And the distinction between data collection and data use is important. For example, an app may collect your location and use it to tell you the weather nearby. Granting permission to location would make sense if you are downloading a weather app. But an app may also collect your location and use it to tell ad providers all the places you go. In this case, giving access to your location would be sketchy if you were downloading a calculator app.

There is also an inherent level of trust built into Apple’s new model for privacy details. As Apple tells app developers:

“You’re responsible for keeping your responses accurate and up to date.”

This means that, to apply these new privacy labels, app developers must self-report their data use when submitting updates to the App Store. Apple does not read through all the code or monitor network traffic to automatically create an app’s privacy details.

Apps can change their behavior with any update, but developers are required to update the privacy details on their own. App reviewers do not flag when the privacy details need an update.

So while the longevity and robustness of the new privacy nutrition labels remain to be seen, we can take a look at how the most popular apps of 2020 report their privacy nutrition details.

Top 2020 Apps

If you have updated to iOS 14.3, it’s interesting to flip through some of the apps you use to see how they report their data collection and use. Although, it’s not exactly easy to compare two apps.

Since Apple recently unveiled the top games and apps of 2020, you can look at all the privacy nutrition label details in search of trends from the apps everyone is using.

So I did. And compiled the Privacy Nutrition Label Data for the Top Apps of 2020.

This starts off with general info regarding what data is collected, then looks at how specific apps and games report data use, and finally lists insights and questions from the investigation. (All the spreadsheets and data are included at the end).

Nutrition Label Data

General statistics
  • 80 total apps
    • 20 free apps
    • 20 paid apps
    • 20 free games
    • 20 paid games
  • 51 updated to report privacy data
    • 32 apps
    • 19 games
  • Top collected data types across all three categories
    • identifiers (70)
    • usage data (70)
    • diagnostics (59)
    • purchases (46)
    • location (42)
    • user content (36)
    • contact info (35)
    • other data (21)
    • search history (16)
    • contacts (14)
    • financial info (12)
    • browsing history (11)
    • sensitive info (7)
    • health and fitness (6)
  • Top collected data types (used to track you)
    • identifiers (27)
    • usage data (23)
    • purchases (12)
    • contact info (10)
    • diagnostics (10)
    • location (10)
    • other data (8)
    • user content (4)
    • browsing history (3)
    • contacts (1)
    • financial info (1)
    • health and fitness (1)
    • search history (1)
    • sensitive info (1)
  • Top collected data types (linked to you)
    • usage data (30)
    • identifiers (28)
    • diagnostics (26)
    • user content (24)
    • purchases (23)
    • location (22)
    • contact info (22)
    • search history (13)
    • contacts (12)
    • other data (11)
    • financial info (10)
    • browsing history (7)
    • health and fitness (4)
    • sensitive info (4)
  • Top collected data types (not linked to you)
    • diagnostics (23)
    • usage data (17)
    • identifiers (15)
    • purchases (11)
    • location (10)
    • user content (8)
    • contact info (3)
    • sensitive info (2)
    • search history (2)
    • other data (2)
    • health and fitness (1)
    • financial info (1)
    • contacts (1)
    • browsing history (1)
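The "across all three categories" totals are the per-category counts added together. As a rough sketch of that tally, here is the sum for the first few data types in Python, using the counts from the lists above (the variable names are just for illustration):

```python
from collections import Counter

# Counts copied from the lists above (first few data types only).
used_to_track = Counter(
    {"identifiers": 27, "usage data": 23, "diagnostics": 10, "purchases": 12, "location": 10}
)
linked = Counter(
    {"identifiers": 28, "usage data": 30, "diagnostics": 26, "purchases": 23, "location": 22}
)
not_linked = Counter(
    {"identifiers": 15, "usage data": 17, "diagnostics": 23, "purchases": 11, "location": 10}
)

# Adding Counters sums the counts for each data type.
overall = used_to_track + linked + not_linked

for data_type, count in overall.most_common():
    print(f"{data_type} ({count})")
```

Note that an app can report the same data type in more than one category, so these totals count reports, not apps.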
By Apps and Games
  • Most types of data collection (17)
    • Facebook
    • Instagram
    • Spotify
    • Twitter
  • No data collection (* these are all paid apps/games)
    • HotSchedules
    • AutoSleep Track Sleep on Watch
    • Shadowrocket
    • EpocCam Webcamera for Computer
    • Arcadia – Arcade Watch Games
  • Only collects data not linked to you
    • Widgetsmith
    • Among Us!
  • Most data types used to track you
    • Twitter (7)
    • Subway Surfers (6)
    • Spotify (5)
Free vs Paid
  • Average types of data collected (overall)
    • Free (10.5)
    • Paid (3.6)
  • Median types of data collected (overall)
    • Free (10)
    • Paid (4)
  • Average types of data (used to track you)
    • Free (2.9)
    • Paid (0.3)
  • Average types of data (linked to you)
    • Free (6.3)
    • Paid (1.1)
  • Average types of data (not linked to you)
    • Free (1.3)
    • Paid (2.2)
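For anyone repeating this with the raw data, the averages and medians are simple to compute. The sketch below uses made-up per-app counts (the real ones are in the spreadsheet linked at the end), then checks that the reported per-category averages add up to the reported overall averages.

```python
from statistics import mean, median

# Made-up per-app counts of data types, for illustration only;
# the real per-app numbers are in the spreadsheet linked at the end.
example_counts = {"free": [12, 10, 9, 11], "paid": [0, 4, 5, 3]}

for tier, counts in example_counts.items():
    print(tier, "average:", round(mean(counts), 1), "median:", median(counts))

# Reported averages from the list above, in the order:
# used to track you, linked to you, not linked to you.
reported = {"free": (2.9, 6.3, 1.3), "paid": (0.3, 1.1, 2.2)}

# Rounding to one decimal place avoids floating point noise in the sums.
overall = {tier: round(sum(parts), 1) for tier, parts in reported.items()}
print(overall)
```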

Insights and Questions

Many of these points stem from the descriptions of Types of data and Data use sections of Apple’s privacy details page.

Free apps
On Apple’s categories:
  • “Identifiers” is a vague name, but it’s related to device and user IDs. These types of IDs are often static and used to link your information across apps and services
  • “User content” from apps not creating user content is interesting (Disney Plus and Netflix). Guessing these are related to the “Customer Support” category.
    • And how does an app have “User Content” not linked to you?
  • “Purchases” is not included by Netflix (as you can’t subscribe in the app)
On companies:
  • Google hasn’t updated info for any of their apps yet
  • Widgetsmith was a breakout iOS 14 app of the year. It only collects anonymous purchase and diagnostic data.
  • WhatsApp is Facebook’s least offensive app.
  • What is Spotify doing with browsing history?
  • Twitter is doing a lot of tracking
On trends:
  • “Data linked to you” is the largest category and shows the most first-party data use
    • “Data used to track you” is “owned by other companies”
  • Companies should move usage data and diagnostics collection from “linked” to “not linked” categories
    • Free games do a somewhat better job collecting anonymous data (but also use the same data types to track you)
  • Top free apps do less data sharing (tracking) than expected

Overall, rules are new, so companies are still getting used to the categories. Guessing they’ve over-reported as it is easier to move to a more private usage category. Companies may interpret rules differently (Twitter vs Facebook vs TikTok, why so different?)

Free games
Paid apps
  • Top paid apps do less tracking and data collection overall
    • Also have most non-updated apps in the top 2020 list
  • “Data Not Collected” is a tag (took going through a lot of apps to find that out…)

App Privacy
The developer, HotSchedules, indicated that the app's privacy practices may include handling of data as described below. For more information, see the developer's privacy policy.
Data Not Collected
The developer does not collect any data from this app.
Privacy practices may vary, for example, based on the features you use or your age. Learn More

Paid games
  • Very few top games have updated
  • Seems Facebook SDK could require Identifiers, location, usage data, diagnostics
Overall
  • Apple, what’s up with the random ordering of data types? Seems to be consistent by count, but not across all apps
  • Health and fitness apps were not very popular this year
  • How do changes to data collection and use get reported? Is there a notification added to the nutrition label?

Wrap up

Probably can do a lot more analysis on all this data, but it’s the holidays and everyone is asking me why I’m working. So I’ll leave it at that. As more apps update with their privacy nutrition details, we can expect to learn more about how the apps we use use our data, and how Apple’s new system changes with time.

Charts and Graphs

Here is all the raw data if you want to compare: Top 2020 Apps – Privacy Summary

☃️ 🛷 ❄️

Categories
Thoughts Travel

Learnings From My First Conference Talk

This past Tuesday I gave my first conference talk at View Source in Amsterdam! It was an awesome experience at an amazing venue in a rainy city where people from all corners of the web came together to discuss many of the challenges, opportunities, and learnings for browsers, web development and the overall landscape of the internet.

I work on creating experiences to help people stay safe and have greater privacy online, so it was enlightening to hear about such a wide range of topics on the web. I’m always impressed by the depth of understanding and passion people have about their subjects of work, and the speakers and attendees at View Source carried an overwhelming amount of inspiration.

Just to name a few, gaming, entertainment, monetization, accessibility, connectivity, and rethinking digital utopianism were all covered. I love hearing about what people are working on. It shows how there is so much to think about and is a humbling reminder that my work is a small piece of a vibrant community.

I was fortunate to attend the conference with a group of us from the Microsoft Edge team. It was a great team bonding experience to get to know others from different parts of the team who I don’t normally work with. While it’s not always possible, I would highly recommend going to conferences with folks from your team. It’s great to have others with a similar frame of reference to talk about new ideas and to be more connected when you get back to work.

My colleague Lillian Kravitz and I spoke about the privacy principles we’ve developed for Edge. Melanie Richards gave a talk about the simple and actionable steps to help make your site accessible to everyone by considering various contrast and theme settings, and others on the team held “conversation corner” discussions about web compatibility and more. The talks were recorded, and I’ll post a link here when it’s available. (Here it is! And me tweeting about the talk.)

A main theme of our privacy talk was listening, learning, and trying to gain a fresh perspective on a topic we thought we were familiar with. I know I am not at all familiar with giving talks on a big stage, but the aspect of learning something new and having a different perspective on presenting my work still felt as fitting to the process of giving the talk as it did to the contents of the talk itself.

I can come back to more about the talk when the recording is posted, but for now, while the experience is still fresh in my mind, I wanted to reflect on the things I learned, what went well, and what I could improve for next time. Because, yes, giving a talk is exhilarating and this one will not be my last.

Preparing

Our talk was second to last on the last day of the conference. It’s tough having a time slot late in the day on a later day of a conference (this post and comments came to mind when I learned of our time). You almost need to leave something small to clean up and keep working on during the conference because if you show up on day 1 ready to go, you’ll have to keep your excitement and preparedness high for quite a while.

It would be great to be at peak preparation the night before the talk, but even then, we ended up waiting 8 hours the day of, as our talk was at 5pm and the events started at 9am. At breakfast the morning of, excitement needs to be reserved because adrenaline could give out well before the talk. I likened the situation to an athlete or musician where a game or performance is late at night (worth looking more into how they manage energy). You need your energy and focus to be up at an hour different than your normal operating schedule.

Which leads to another interesting aspect of this conference. Traveling to a different time zone can be debilitating for the first few days. Especially when it’s many hours different than you’re used to (and seemingly more so when going east around the globe?).

I am not one to take naps normally, but when your schedule is turned upside down, naps can be your friend.

Luckily the hotel was near the conference theater, so it was easy to go back to sleep. I was conflicted because I wanted to listen to all the talks, but I knew if I wanted to have the energy for my talk, I’d need to sleep a bit before we were up.

My pre-talk routine (but maybe not a routine because I only did it once) was to check the slides early in the morning before the first talk, listen to the first few talks, go for a nap, head back for lunch, listen to more talks (three hours before ours), regroup for a bit just before getting mic’ed up, then go on stage. Seemed fine. I think the whole process would have been easier in my normal time zone, but this helped manage energy and focus well enough.

The talk

It’s impossible to even scratch the surface of all you need to know going into something you’ve never done before. You have to put yourself out there and figure things out as you go.

There’s a lot of “tribal speaker knowledge” I learned from this first talk. Questions I hadn’t considered asking because they didn’t even come to mind before, and issues I could have mitigated had I known a bit more about the process. All good takeaways though. Makes me want to try again soon to test out my new perspective.

First, I think I was a little too reliant on my slide notes. I wanted to be sure to hit the speaking points we planned, but the talk felt less conversational as a result. The story we were going for lent itself to a more prescriptive presentation style, as we were sharing a process others might be able to apply, but I enjoyed the more casual and friendly sounding style of some other presenters that was more akin to giving a well thought out answer to a question rather than reading a speech.

Awareness of my over-reliance on notes cropped up when, under some unforeseen circumstances, a few of my notes got cut off from the presenter screen. Without the expected cue, I stumbled a bit to keep with the flow I’d practiced when leading from an idea on one slide to the next. This was unfortunate because we checked the presenter screens before the talk; I just missed the few slides that had issues.

But when things don’t go according to plan, you’ve got to improvise! You can’t do a dance and walk off stage. You have to keep going!

Second was the simple problem of the clicker having issues advancing slides. At one point I thought I was ahead of where I was only to realize I missed a slide. (Sorry folks, that one image transition really made the talk 🙃).

After the talk, we went backstage to the “green room” to talk about how it went. In a detail that was eye-opening to me, another presenter mentioned that before his talk he asked the AV team where to point the clicker. I hadn’t even considered doing that. I figured the thing would just work (and I really think it just should), but for such a simple yet crucial piece of presentation consistency, it was important to understand. This is the kind of tribal knowledge someone who has given talks at a variety of venues and presentation setups might have, but it had not even crossed my mind.

Overall though, I think we did well. We connected ideas from other talks in the conference about privacy, collaboration, and the future of the web, and presented our customer focus as a way to reframe thinking about developing experiences. We realized there is always more to learn, and listening to feedback to spur continuous improvement was a common theme encompassing our time at the conference.

So yeah, that was the talk. Lots to think about for next time, but mostly minor tweaks to smooth out delivery. It was a great start to what I am looking forward to as the beginning of many more to come. I definitely have areas to improve, and am anxiously awaiting the recordings to come out to kick myself over all the little things I didn’t get quite right. But I’m not going to harp on the mistakes. I’m going to learn from them to make my next talk even better. Can’t wait.

Touristing

Oh, and I mentioned the talk was in Amsterdam!? How about a quick travel update to round out the trip.

Side note, I think the concept of being a tourist and trying to avoid touristy things is funny. Why try so hard? Just go, enjoy the culture, and have a good time!

Side side note, a couple weeks ago at an organized bike ride in Seattle, which I would consider a very local thing to do, I met a couple who traveled from Missouri (I think it was Missouri, can’t remember exactly) who were visiting specifically to do the bike ride. No idea how they found out about it, but I was amazed at their ability to be local tourists. Pretty cool.

Anyway, I really like Amsterdam. The bikes, canals, frites, stroopwafels, and tiny red cars all come together into a bustling culture. People are friendly, even if I often misunderstand what’s said under a Dutch accent (a taxi driver asked me how long I had to wait for the ride, and I answered I would be returning to the US. Thought he asked where I was heading… Sorry!).

Amsterdam is the first place outside of the USA and Canada I’ve now been to twice, and I would definitely go again. Here are some photos from the rainier and sunnier parts of quickly playing tourist while on a trip for work.

Categories
News Feed

Facebook Privacy Report from The New York Times

As Facebook is upending the journalism industry, the New York Times continues its campaign of exposing Facebook’s questionable data use.

Summary from The Download via the MIT Technology Review

https://www.technologyreview.com/the-download/612642/facebook-gave-more-than-150-companies-special-access-to-your-data/

Categories
News Feed

Google transferred ownership of Duck.com to DuckDuckGo

This made quite the ruffle today when Google transferred the domain duck.com to the privacy-focused search engine DuckDuckGo.

Google’s ownership of Duck.com was previously a source of frustration for DuckDuckGo, when it would redirect users to Google’s rival homepage instead of DuckDuckGo. Google kindly tried to clear up this confusion in July by adding a DuckDuckGo link to the page. Visiting Duck.com now redirects users straight to DuckDuckGo.

via The Verge

The best part is the previous page for duck.com

Categories
News Feed

Location Data Privacy in Apps

The New York Times released a report (with some fancy graphics) detailing location data use by apps for advertising, outside the main purpose of the app. Only 10 apps were covered in depth, but the findings reveal how some advertising companies aggregate location data from apps.

Categories
News Feed

What the Marriott Breach Says About Security

Your personal data is already stolen. Here’s what you need to be doing:

via Krebs on Security

 

Categories
News Feed

Twitter’s Important Updates

I opened Twitter today and was welcomed with a message about their updated Terms of Service and Privacy Policy in time for GDPR.

Twitter is updating its Terms of Service and Privacy Policy to provide you with even more transparency into the data Twitter collects about you, how it’s used, and the controls you have over your personal data. These updates will take effect on May 25, 2018

Anyway, here’s the update and additional policy information for Twitter and Facebook.

Categories
Technology Thoughts

What we learned from Facebook this week

For all the talk with Facebook CEO Mark Zuckerberg in the US Senate and House this week, there was very little surprising content. We give consent to use the Facebook service, we upload images, write posts, and like articles. We have control at every step of our interaction to decide how much to share with Facebook, and what we give the company is exactly what is given back to us in the data archive download tool. It’s shocking to see every interaction you’ve ever made on Facebook in one place, but there is nothing here we don’t expect. There is no post we didn’t make or image we didn’t take. Facebook remembers what we do on the service as long as we have an account.

But that doesn’t mean everything from the last week was old information.

What was clarified?

An important point Zuckerberg reiterated is that Facebook does not sell user data. This would be a silly business move because Facebook’s value to advertisers is in the uniqueness of its data. It is in Facebook’s best interests to keep its trove of data secure, as that is what requires advertisers to keep coming back. There’s no other place advertisers can go to get the same level of targeting.

Instead of selling data, Facebook actually collects all the details from every person “in the community” and compiles the best advertising opportunity for a given ad. Facebook assures advertisers their ad placement will reach the intended audience with the greatest possibility of interaction. It is this assurance that gives Facebook its gazillion dollar market cap.

The Cambridge Analytica case was different, but still Facebook never sold data. Instead, Cambridge Analytica got raw Facebook user data from an app developer who used a survey app to harvest data. In 2014, it was within Facebook terms for a third-party app developer to use the Facebook developer platform to collect just about all the information about you and all your friends ever entered onto the site.

Listen to Exponent episode 146 “Facebook’s Real Mistake” (link at the end) for background on how Facebook’s past push to be a platform landed the company in this situation. The takeaway? Had Facebook realized its value as an ad network, the company would never have given the same level of data access in the first place.

This is why the current Facebook fiasco is not a data security breach, but a data privacy leak. Hackers did not break into Facebook systems to obtain user data, but a developer (which could have been anyone) used Facebook-sanctioned tools to collect your information. Facebook has since locked down its platform to prevent such unrestricted access to user data, but it does not change the fact that massive amounts of user data left the platform seemingly without consent of its users. And yes, it’s true that by signing up you agreed to the terms that allowed developers to leverage the wide open API to gather profile information, but did you really know that was part of the agreement?

What was surprising and novel?

Did you check if your info was collected by Cambridge Analytica? Go ahead, I’ll wait ⌚😊

After you’ve read through your activity log and exported your data, take a minute and think about what stands out from the content (I think this tinfoil hat scandal is all a ploy to get us to go on Facebook even more. Feel free to finish reading in the meantime, the export takes a while). Once you get to the details, you can see the majority of the information came from you, but there is a small subset which reveals the inner workings of the Facebook machine.

To put things in perspective, focus on your ad preferences and take a look at your ad demographics information. This is a window to the 9698 categories from the Senate hearing. Advertiser demographic is the result of running all our interactions on Facebook through a proprietary algorithm. Of all the information in the data archive, this piece is novel. We didn’t explicitly tell Facebook this information, but they determined it based on what we’ve done on the site.

This is why the Facebook hearing this week is only the tip of the iceberg. If we are concerned that Cambridge Analytica could sway an election with a slice of our data, what kind of power does Facebook have? Sure we didn’t entrust Cambridge Analytica with our data, but why does opting into a puppy video sharing service change our perception of possible psychological manipulation?

What does Facebook do with all our data? And what can they do?

We need greater transparency on how our data is used. I can control and know what I upload, but what happens with the data “I own” once it’s handed over?

When I upload a photo to Facebook, what algorithms are tuned as a result? How does the content of the photo affect ads I see?

WhatsApp communication is encrypted, so it’s private between those in the conversation, but in what way does Facebook link my WhatsApp, Instagram, Facebook accounts? I’ve logged into all three on the same device so they must know it’s the same person (even though I signed up for all three as separate users).

And what about activity coming from the same IP address or GPS location? Does Facebook correlate data of those physically closest to me, outside of our connections on its services? What about when I’m on Facebook but signed out?

The consumer facing fun part seems like a front for the stingy advertising business on the back end. What is the difference between the two? It’s telling that Zuckerberg doesn’t fully understand the difference (from questioning by Brian Schatz). From Facebook’s perspective, the “fun part” is the user feature set that drives advertising revenue. It’s the top of the funnel for all of Facebook’s algorithms and drives the company’s valuation.

For a platform that relies on its users to generate value, the company doesn’t provide much information to said users on how the internal cogs work. Perhaps it’s best to be blissfully unaware, or maybe it’s not a requirement, but when 2 billion people feel like the product and not the customer, it’s reasonable for them to want a little more information on how they’re being used.

And if this is Facebook, what about Google? (You can also export Google data)

What can you do to stay in control?

  1. Adjust log-in behavior to prevent future data leaks
    Check permissions when using Facebook (or Google or any other service) to sign up for a new site. To keep the same convenience, sign up for a password manager like Dashlane or LastPass which can generate and remember a new login for each site you visit. This adds a layer of security to your accounts and removes the possibility of another Cambridge Analytica style data leak.
  2. Prevent cross site tracking
    Use a separate browser just for Facebook. Only log in to Facebook on that browser and do all your other web stuff in another. Or use extensions like Ghostery (which also tracks your trackers, so maybe just turn off the internet for the day…) or the Facebook Container for Firefox.
  3. Limit sharing data
    Just use Facebook less? Deactivate for a week and see how you feel. You can always reactivate.
    Go old school and use an RSS reader.
    Stick with iMessage/FaceTime.
    This is always an option.
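
To give a feel for what "generate a new login for each site" means, here is a minimal Python sketch using the standard library’s `secrets` module. It is not how Dashlane or LastPass work internally; the `generate_password` helper and the example site names are made up for illustration, and a real password manager also encrypts and stores what it generates.

```python
import secrets
import string


def generate_password(length=20):
    """Build a random password from letters, digits, and punctuation."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    # secrets.choice draws from a cryptographically secure random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))


# Hypothetical sites: each one gets its own unrelated password,
# so a leak at one site does not expose the login for another.
site_passwords = {site: generate_password() for site in ("example.com", "example.org")}

for site, password in site_passwords.items():
    print(site, len(password))
```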

All sorts of links

Video of Zuckerberg’s Senate hearing (transcript) and appearance before House committee (transcript)
Day 2 from MIT Technology Review
What was Facebook Thinking by James Allworth
The Facebook Current and The Facebook Brand from Stratechery
Facebook and Cambridge Analytica Explained from NYTimes
Facebook’s Real Mistake and Facebook Fatigue from Exponent Podcast
Mark Zuckerberg is Either Ignorant or Deliberately Misleading Congress from The Intercept
Mark Zuckerberg on Facebook’s hardest year, and what comes next from Vox
What is GDPR?
General Data Protection Regulation
Coachella streams 1, 2, and 3