Amazon Insider


Course: Design for Artificial Intelligence Products and Services
Timeline: 1 Month
In collaboration with: Ally Liu, Hyun Woo Paik, and Rachel Lee
My Role: Wireframing/Prototyping, Technological Feasibility.

This page is unfinished.


Introduction—


Prompt

"Students will work on teams of three or four to envision a novel service that employs natural language processing (NLP) technology. The challenge for this assignment is to discover where the often very limited ability of NLP technology to make sense of people’s messages, conversations, and documents might still deliver value to a target set of users and to a service provider. In making the choice of what to design, teams should focus on the challenges of finding or constructing a labeled dataset needed to produce the inferences. Teams are free to employ labeled datasets that already exist or to design a system to construct a dataset as people use it."


Objectives
  1. Synthesize at least 20 feasible ideas that take advantage of NLP
  2. Narrow down to 3 of those ideas based on feasibility given current NLP standards, potential business value, and overall utility of the product/service.
  3. Choose 1 of those ideas to pursue for the project.
  4. Prototype to the greatest risk—how do you counteract the greatest weaknesses of your project?
  5. Create digital prototypes to explore how users would engage with the product/service.

Initial Ideas—


20 Ideas

First, we set up a Google spreadsheet where we could brainstorm and collect ideas. We broke each idea down into a few components: the capability, the activity or domain, the dataset that would be leveraged, the target audience, our intuition of the difficulty, and the user value.

[Image: brainstorming spreadsheet]

The link to the spreadsheet can be found here.

Our team explored 20 ideas for NLP. After closely examining technological feasibility (our intuition of the difficulty) and the value to businesses and customers, we settled on three final ideas that combined low technical difficulty with high value. Because none of us were very familiar with NLP, we had to read more about its capabilities and how it currently works in order to judge how feasible each idea was. Our team reviewed a few papers and videos that explain some of the current NLP systems and how they function.


Top 3 Ideas
  1. Generate LaTeX based on text

    Use NLP to go through the LaTeX section of Stack Overflow to see what people most commonly want to do in LaTeX, and then generate LaTeX from text supplied by the user. This would draw on Stack Overflow's database of previously asked questions and answers.

  2. Filtering Through Product Reviews (Chosen Idea)

    Help retailers (e.g., Amazon sellers) determine whether they have a problem with their products or customer service, and what they are currently doing well.

  3. Facilitating better matches between potential companies and employees

    Help job seekers build a better profile by modelling it on what appears in successful profiles, e.g., skills and the wording of bios.

    Detect whether people’s LinkedIn profiles have changed in implausible ways, as a check on a candidate's truthfulness and integrity, e.g., an undergraduate university or major that changes over time.


Narrowing Down: Our Final Idea

We decided as a team, with the help of our professor, to go with our second idea (using NLP for product reviews). We chose it because it could add a large amount of value for companies at a relatively low difficulty of implementation. Existing NLP techniques such as topic modelling (which is covered later!) support our case for the feasibility of the service.

At this point, our idea was that the service would give users a summary of their product reviews, helping them work out next steps for improving their products.


Critique 1—


Presentation & Feedback

Our class held a critique session in which we presented our progress so far. By this point, our team had narrowed down to our final idea, but we were still unsure which type of product reviews to analyze. Our team was split between using NLP to parse application reviews (on platforms such as the Google Play Store or Apple's App Store) and reviews on a platform like Amazon.

[Images: App Store and Amazon]

The general consensus was that a platform like Amazon offered more value to the end users of our product, since companies can have a larger number of products on sale there. For example, a company like Nike potentially sells hundreds of shoes through Amazon, whereas an application developer probably won't have as many listings.

At this point, we had a compelling idea for our design but not a compelling story. Our professor said that we should make a case for our choice of service. The professor also pointed out that NLP may not currently be strong enough to summarize reviews with accuracy high enough for companies to pay for. However, there is still value in a service that would, for example, count the number of times certain negative words appear in the reviews and raise flags for issues a product may be having. Because humans are good at interpretation and comprehension, that task can stay with humans, while NLP aids them in a different, more computational way.
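The counting-and-flagging idea above can be sketched in a few lines of Python. This is only an illustration, not something we built for the project: the word list, the function names `count_negative_words` and `raise_flags`, the `min_share` threshold, and the sample reviews are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical word list; a real service would need a curated or learned lexicon.
NEGATIVE_WORDS = {"broken", "late", "refund", "defective", "leaked"}

def count_negative_words(reviews):
    """Count how many reviews mention each negative word at least once."""
    counts = Counter()
    for text in reviews:
        words = set(re.findall(r"[a-z']+", text.lower()))
        counts.update(words & NEGATIVE_WORDS)
    return counts

def raise_flags(reviews, min_share=0.2):
    """Return the negative words found in at least `min_share` of the reviews."""
    counts = count_negative_words(reviews)
    total = len(reviews)
    return sorted(w for w, n in counts.items() if total and n / total >= min_share)

# Made-up reviews for illustration only.
reviews = [
    "Arrived broken, asking for a refund.",
    "Great shoes, very comfortable.",
    "The box was broken and delivery was late.",
    "Love them!",
    "Broken zipper after one week.",
]

print(count_negative_words(reviews))
print(raise_flags(reviews, min_share=0.4))
```

The interpretation of each flag is left to a person: the code only reports that a word such as "broken" shows up in a large share of reviews.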

Another proposed feature was competitive analysis: applying the same analysis to other products in categories similar to those of the user's company.


Competitor Analysis—


We looked at a few services that offer something similar, and from there considered how natural language processing could help us provide an even better service.


Amazon Insights
[Image: Amazon Insights]


Sellics
[Image: Sellics]

Both of these services give businesses insights into how their products are doing. However, our product could generate more value by using NLP to analyze reviews, surfacing key highlights more quickly and accurately.


Prototyping to the Greatest Risk—

"Prototyping to the greatest risk in this case is not creating wireframes or creating digital prototypes. If you are creating digital prototypes or wireframes, you are assuming that the interaction between the user and your product is your greatest risk. That's probably not the case for your projects." — Prof. Zimmerman

Features Redefined

[Image: Thinking about features for the system and how it might work.]

[Image: Planning out the features.]

At this point, we set out to properly define the features of our project. We focused on features because the risks of our product are resolved by defining its features and its "algorithm". One feature we considered was categorization of reviews. We imagined that this feature would help a user see which categories of reviews correlate with good or bad ratings and make inferences from those trends.
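As a rough illustration of how categories could be related to ratings, the sketch below averages the star rating of the reviews in each category. It assumes every review already carries one or more category labels; the field names (`rating`, `categories`), the function name, and the data are hypothetical.

```python
from collections import defaultdict

def average_rating_by_category(reviews):
    """Map each category to the mean star rating of the reviews tagged with it."""
    ratings = defaultdict(list)
    for review in reviews:
        for category in review["categories"]:
            ratings[category].append(review["rating"])
    return {category: sum(r) / len(r) for category, r in ratings.items()}

# Made-up, pre-labeled reviews for illustration only.
labeled_reviews = [
    {"rating": 1, "categories": ["shipping"]},
    {"rating": 2, "categories": ["shipping", "sizing"]},
    {"rating": 5, "categories": ["comfort"]},
    {"rating": 4, "categories": ["comfort", "sizing"]},
]

averages = average_rating_by_category(labeled_reviews)
# List categories from lowest to highest average rating.
for category, avg in sorted(averages.items(), key=lambda kv: kv[1]):
    print(f"{category}: {avg:.1f}")
```

A seller reading this output would see the lowest-rated categories first and could decide where to look more closely.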

Risks:
[Image: list of risks]


Navigating Risks:
[Image: list of features]

Technological Feasibility

One of our risks is the feasibility of the project. NLP in its current state cannot guarantee the accuracy that many novel ideas built on it would require. Therefore, we had to look into how feasible our product was. This was a large part of my role in this project.

I looked into different NLP technologies such as word vectors, sentiment analysis, and topic modelling. These technologies had to be robust enough to be used in a commercial context. We used websites such as CoreNLP to test how well current NLP can analyze reviews. Accuracy for AI in general depends, to some degree, on the specificity of the domain and the dataset used. Since Amazon reviews belong to many different domains, it is hard to analyze these reviews on their own.

Topic modelling provides a way to cluster these reviews into categories that allow for easier analysis.
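To give a feel for what clustering reviews into categories means, here is a deliberately simple stand-in. It is not a true topic model such as LDA; it is a greedy word-overlap clustering that puts a review into the existing group whose vocabulary it shares the most words with (by Jaccard similarity). The stopword list, the `threshold` value, the function names, and the sample reviews are all assumptions made for illustration.

```python
import re

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"the", "a", "an", "and", "was", "is", "it", "my", "in",
             "to", "of", "very", "too", "for", "after"}

def content_words(text):
    """Lowercase a review and keep its non-stopword tokens as a set."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def cluster_reviews(reviews, threshold=0.2):
    """Greedily group reviews whose word sets overlap with a cluster's vocabulary."""
    clusters = []  # each cluster: {"vocab": set of words, "reviews": list of texts}
    for text in reviews:
        words = content_words(text)
        best, best_score = None, 0.0
        for cluster in clusters:
            union = words | cluster["vocab"]
            # Jaccard similarity between the review and the cluster vocabulary.
            score = len(words & cluster["vocab"]) / len(union) if union else 0.0
            if score > best_score:
                best, best_score = cluster, score
        if best is not None and best_score >= threshold:
            best["vocab"] |= words
            best["reviews"].append(text)
        else:
            clusters.append({"vocab": set(words), "reviews": [text]})
    return clusters

# Made-up reviews for illustration only.
reviews = [
    "Shipping was slow and the box arrived damaged.",
    "The shoes run small, order a size up.",
    "Slow shipping, box was crushed.",
    "Size runs small for my feet.",
]

clusters = cluster_reviews(reviews)
for i, cluster in enumerate(clusters):
    print(i, cluster["reviews"])
```

On this toy data the shipping complaints and the sizing complaints end up in separate groups. A real topic model would learn such groupings statistically from a much larger set of reviews rather than from exact word overlap.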