Exploring Downtown Toronto Neighborhoods with the Aim of Setting Up a Supermarket
A capstone report by Oladapo Ajayi
Introduction
In setting up a supermarket, which neighborhood(s) in the Downtown Toronto area are best suited for optimum profit?
A little background: Downtown Toronto is the business hub of the city of Toronto, Ontario, Canada. Its land area is 14 square kilometers. It is bounded by Bloor Street to the north, the Don Valley to the east and Bathurst Street to the west. It is a very busy area, because it is a seat of government and a residential area as well as a major business district.
Based on the above, I shall focus this report on the Downtown Toronto area, whose unique characteristics make it well suited to this research. For more information on this area, kindly visit the Wikipedia page: https://en.wikipedia.org/wiki/Downtown_Toronto
Data Source
I will be using the Foursquare API.
Because of the richness and detail of the data available from the Foursquare API, data from this source will be used for this research. Foursquare is a good source of location data with an API for developers. Alternatives include Google. I will be using Foursquare because its data is very rich and detailed.
Fetching Latitude and Longitude
I will use a geocoder to get the latitude and longitude coordinates of Downtown Toronto.
Web Scraping
To get the data needed for this research, I shall scrape a Wikipedia page to obtain the list of boroughs and neighborhoods in Toronto. This list is particularly important for my analysis.
I will be using the Wikipedia page below
Methodology
1. Downloading and cleaning data
In this step we scrape the required data from the Wikipedia page given above. We clean the table and extract only the information that is needed, i.e. the rows for the city of Toronto, into a pandas dataframe.
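As a rough illustration of the cleaning step, here is a minimal sketch in plain Python. The sample rows, the column names (PostalCode, Borough, Neighborhood) and the helper clean_rows are assumptions made for illustration, not the exact table or code used in this report. It keeps only the rows for one borough and merges neighborhoods that share a postal code.

```python
# Hypothetical sample rows standing in for the scraped Wikipedia table.
# Column names are an assumption for this sketch.
RAW_ROWS = [
    {"PostalCode": "M1A", "Borough": "Not assigned", "Neighborhood": "Not assigned"},
    {"PostalCode": "M5A", "Borough": "Downtown Toronto", "Neighborhood": "Regent Park"},
    {"PostalCode": "M5A", "Borough": "Downtown Toronto", "Neighborhood": "Harbourfront"},
    {"PostalCode": "M4A", "Borough": "North York", "Neighborhood": "Victoria Village"},
    {"PostalCode": "M5E", "Borough": "Downtown Toronto", "Neighborhood": "Berczy Park"},
]


def clean_rows(rows, borough="Downtown Toronto"):
    """Keep rows for one borough and merge neighborhoods sharing a postal code."""
    merged = {}
    for row in rows:
        if row["Borough"] != borough:
            continue  # skips unassigned rows and every other borough
        merged.setdefault(row["PostalCode"], []).append(row["Neighborhood"])
    return [
        {"PostalCode": code, "Borough": borough, "Neighborhood": ", ".join(names)}
        for code, names in merged.items()
    ]


cleaned = clean_rows(RAW_ROWS)
for row in cleaned:
    print(row)
```

The same filtering could be done directly on a pandas dataframe; plain dictionaries are used here only so the sketch runs without extra packages.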
2. Geocoding the locations to obtain geographical coordinates (latitude, longitude)
Now that we have built a dataframe of the postal code of each neighborhood along with the borough name and neighborhood name, we need the latitude and longitude coordinates of each neighborhood in order to use the Foursquare location data. We use the geocoder library to accomplish this. However, this package can be very unreliable, so in case we are not able to get the geographical coordinates of the neighborhoods with it, here is a link to a CSV file that has the geographical coordinates of each postal code: http://cocl.us/Geospatial_data
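The CSV fallback could look roughly like the sketch below. It assumes the file has the columns "Postal Code", "Latitude" and "Longitude"; the inline CSV text, its coordinate values and the helper names load_coordinates and add_coordinates are illustrative stand-ins, not the downloaded file or the code used in this report.

```python
import csv
import io

# Illustrative CSV text; column names and values are assumptions about the file.
SAMPLE_CSV = """Postal Code,Latitude,Longitude
M5A,43.65426,-79.360636
M5E,43.644771,-79.373306
"""


def load_coordinates(csv_text):
    """Map each postal code to a (latitude, longitude) tuple."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["Postal Code"]: (float(row["Latitude"]), float(row["Longitude"]))
        for row in reader
    }


def add_coordinates(neighborhoods, coords):
    """Attach coordinates to each record; None marks a postal code with no match."""
    result = []
    for record in neighborhoods:
        lat, lng = coords.get(record["PostalCode"], (None, None))
        result.append({**record, "Latitude": lat, "Longitude": lng})
    return result


neighborhoods = [
    {"PostalCode": "M5A", "Neighborhood": "Regent Park, Harbourfront"},
    {"PostalCode": "M5E", "Neighborhood": "Berczy Park"},
    {"PostalCode": "M9Z", "Neighborhood": "Hypothetical unmatched code"},
]
located = add_coordinates(neighborhoods, load_coordinates(SAMPLE_CSV))
for record in located:
    print(record)
```

Leaving unmatched postal codes as None, rather than dropping them, makes it easy to see which neighborhoods still need coordinates.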
A detailed description of the geocoding process can be found at the link below:
3. Explore and Analyze the Downtown Toronto Neighbourhood Data
Here we use the Foursquare API to explore neighborhoods in the city of Toronto. We use the explore function to get the most common venue categories in each neighborhood, and then use this feature to group the neighborhoods into clusters.
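To make the "most common venue categories" feature concrete, here is a minimal sketch. The venue list is hypothetical sample data, not results returned by Foursquare, and the helpers category_frequencies and top_categories are names invented for this illustration. It computes the share of each category per neighborhood and then ranks the categories.

```python
from collections import Counter, defaultdict

# Hypothetical (neighborhood, venue category) pairs; not real Foursquare output.
VENUES = [
    ("Berczy Park", "Coffee Shop"),
    ("Berczy Park", "Coffee Shop"),
    ("Berczy Park", "Restaurant"),
    ("Rosedale", "Park"),
    ("Rosedale", "Park"),
    ("Rosedale", "Playground"),
]


def category_frequencies(venues):
    """Return the share of each venue category within every neighborhood."""
    counts = defaultdict(Counter)
    for neighborhood, category in venues:
        counts[neighborhood][category] += 1
    freqs = {}
    for neighborhood, counter in counts.items():
        total = sum(counter.values())
        freqs[neighborhood] = {cat: n / total for cat, n in counter.items()}
    return freqs


def top_categories(freqs, n=10):
    """Rank categories per neighborhood by share; ties are broken alphabetically."""
    return {
        hood: [
            cat
            for cat, _ in sorted(shares.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
        ]
        for hood, shares in freqs.items()
    }


freqs = category_frequencies(VENUES)
top = top_categories(freqs, n=10)
print(top)
```

The per-neighborhood shares in freqs are the kind of numeric feature vector a clustering algorithm can work on.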
Machine Learning
I will use the k-means clustering algorithm to complete this task. Finally, we will use the Folium library to visualize the neighborhoods in Downtown Toronto and their emerging clusters.
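For readers unfamiliar with k-means, the sketch below shows the idea in plain Python. It is a simplified version of the standard algorithm (Lloyd's iteration) written for illustration, not the library implementation used for the results in this report. The points are hypothetical two-dimensional vectors (coffee shop share, park share), and seeding the centroids with the first k points is a simplification.

```python
import math


def kmeans(points, k, iterations=20):
    """Simplified k-means: the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical feature vectors: (coffee shop share, park share) per neighborhood.
points = [(0.6, 0.0), (0.1, 0.7), (0.5, 0.1), (0.0, 0.8)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

In practice a library implementation with better initialization would normally be preferred; the cluster labels it returns can then be used to color the neighborhood markers on the map.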
Results and Discussion of Findings
Once the data on Downtown Toronto has been successfully obtained, data analysis can be carried out to support the results and discussion of findings.
The graph below shows the distribution as a histogram: the Rosedale, Christie and CN Tower/Island Airport areas have some of the fewest venues, assuming the number of venues for each area plotted cannot be greater than 100.
It can be observed that there are over 100 coffee shops, over 80 cafes and nearly 45 restaurants and hotels in Downtown Toronto.
We also find that there are only around 10 supermarkets in the whole of Downtown Toronto, as shown in the fifth bar from the end of the chart.
Here is a table showing the top 10 most common venues for each neighborhood in Downtown Toronto.
It is observed that supermarkets do not appear among the top 10 venues of any neighborhood.
The next step is to cluster these neighborhoods based on the above information and plot them on the map to see the overall emerging clusters.
Cluster 0 (Colored red)
This cluster has a good mix of venue categories. There are lots of restaurants and fast food joints. It also has many options for recreational activities such as pubs, nightclubs and parks.
Cluster 1 (Colored purple)
There is an airport here. It may not be advisable to set up here.
Cluster 2 (Colored blue)
We can observe that eateries and men's shops are common here.
Cluster 3 (Colored green)
This cluster has characteristics similar to those of Cluster 0.
Cluster 4 (Colored orange)
Berczy Park could be the best place to set up the supermarket.
Conclusion
In this report we have examined the Downtown Toronto data to find out which location would be ideal for setting up a grocery store.
We have explored the venue data obtained from the Foursquare API and the data obtained by scraping the web.
We then applied various plotting and machine learning techniques to explore the insights further.
We find that grocery stores are not as common as coffee shops or restaurants.
There are only about 10 grocery stores in the Downtown Toronto neighborhoods. This study can be used to decide whether a location is suitable for setting up a grocery store or not.
Future Scope
There is still a large scope for improvement in the dataset. We did not consider one main data entity in this analysis, which is people.
A very crucial point while starting up a business is to analyze the neighborhoods