05 April, 2019

INSY 3400 (Programming Assignment 1)


It is not always practical to compute the steady-state or absorbing probabilities analytically for
complex systems. In this assignment, you will use the programming language of your choosing
to compute the specified probabilities. It is recommended that you use Python, but if you have
not yet taken INSY 3010 you may prefer to use MATLAB or another language that you have used
before.
Consider the Skittles and M&M's problem from the homework (and exam), which is restated below.
Three mini bags of Skittles and three mini bags of M&M's are distributed in two candy
bowls (one red, one brown) in such a way that each bowl contains three bags of candy. At
each step we draw one candy bag from each bowl and exchange them. Let Xn be the
number of bags of Skittles in the red candy bowl at time n.
The recommended steps to solve for the steady state probabilities empirically are listed below.
• Create a list of lists for the one-step transition probability matrix P using the appropriate
values. Keep in mind that this is a problem you have already worked, so you may refer
back to your homework to obtain P.
• Generate a random starting state, s, from {0,1,2,3} and record it in the list of states visited.
• Create a list, referred to as lim below, populated with 0s, to store the number of times each
state has been visited.
• Execute a transition 10,000 times:
o Generate a random uniform number between 0 and 1.
o Depending on what the current state is, determine what the next state will be by
using the conditional pmf for the given state.
 Remember that you can convert the conditional pmf to a cdf, and then find
the bracket containing the generated random number
 For the row of the transition matrix that corresponds to the current state,
add the transition probabilities one by one from left to right until the sum
exceeds the value of the random number. The first time when that happens
determines the next state.
o Add 1 to the entry in lim corresponding to the selected state.
• Divide each entry in lim by the total number of transitions executed to obtain the
estimated steady state probabilities.
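The recommended steps can be sketched in Python as follows. This is a minimal sketch, not a reference solution. The matrix P is an assumption derived from the problem statement by treating every bag in a bowl as equally likely to be drawn, so check it against your own homework. The names next_state and simulate, and the seed parameter, are illustrative.

```python
import random

# One-step transition matrix for states 0..3 (bags of Skittles in the red bowl).
# Assumption: each of the three bags in a bowl is equally likely to be drawn.
P = [
    [0,     1,     0,     0],
    [1 / 9, 4 / 9, 4 / 9, 0],
    [0,     4 / 9, 4 / 9, 1 / 9],
    [0,     0,     1,     0],
]


def next_state(row, u):
    """Add probabilities left to right until the running sum exceeds u."""
    cumulative = 0.0
    last_reachable = 0
    for state, p in enumerate(row):
        if p > 0:
            last_reachable = state
        cumulative += p
        if cumulative > u:
            return state
    # Only reached if floating-point round-off leaves the row sum just below u.
    return last_reachable


def simulate(P, n_steps=10_000, seed=None):
    """Estimate steady-state probabilities from visit frequencies."""
    rng = random.Random(seed)
    s = rng.randrange(len(P))      # random starting state
    lim = [0] * len(P)             # visit counts per state
    for _ in range(n_steps):
        u = rng.random()           # uniform number in [0, 1)
        s = next_state(P[s], u)
        lim[s] += 1
    return [count / n_steps for count in lim]


if __name__ == "__main__":
    print(simulate(P))
```

Passing a fixed seed makes a run reproducible, which helps when the report has to state the estimates that the code produced.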
Submission Instructions
Upload a code file with all of your code and a pdf with your submission report.
For your code file:
• Put your name in the comments at the top of the file
• Include your empirically determined steady state probabilities in the comments at the top
of the file
• Your code must be commented
• If you are using Python, submit a single .py file
• If you are not using Python, submit all of the files needed to execute your experiment.
For the report:
• The header should include your name (as it appears on Canvas) and the name of the
assignment.
• Write a one-paragraph introduction that outlines the problem and objective and states which
programming language you are choosing to use. If you are not using Python, include
brief instructions for how to use your submitted files to re-create your results.
• Write a one-paragraph analysis of the results. Clearly state what your randomly selected
initial state is and what the computed steady-state values are. Discuss any similarities and/or
differences to the steady-state values you have computed by hand in previous
assignments.
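For the comparison with hand-computed values, one option is to approximate the steady-state distribution numerically by repeatedly multiplying a probability vector by P. The sketch below again assumes a matrix P built from equally likely draws; the function name stationary and the iteration count are illustrative choices, not part of the assignment.

```python
# Assumed transition matrix (equally likely draws); verify against your homework.
P = [
    [0,     1,     0,     0],
    [1 / 9, 4 / 9, 4 / 9, 0],
    [0,     4 / 9, 4 / 9, 1 / 9],
    [0,     0,     1,     0],
]


def stationary(P, iterations=1000):
    """Approximate the steady-state vector by iterating pi <- pi * P."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


if __name__ == "__main__":
    print(stationary(P))
```

The gap between this vector and the simulated frequencies gives a concrete measure of the sampling error to discuss in the analysis paragraph.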

21 September, 2018

Project Brief


The University of Sydney has made several significant changes to its infrastructure over the past few years (new buildings, new services, etc.). The intention has been to provide better infrastructure, services and facilities that suit the needs of staff and students while respecting the interests of the local community. The university wants to know how these improvements have changed how staff and students use and move around the university campus. You have been tasked with analysing the change (from 2016 to 2018) in staff, student and visitor movements, investigating other potential improvements and making recommendations to the university for what it should do next.
Since this is a large task, you have been asked to focus on one of the following issues:
• Transport
• Safety and security
• Marketing opportunities
• Retail and food
• Recreation and leisure facilities
• Study facilities
Since the university recognises the importance of geography, you have been provided with a number of spatial datasets to use for your analysis. The datasets have been collected by a number of organisations using a variety of different methods. For this reason, the datasets may contain errors or contradict each other. The same applies to the same type of data collected over several years. The following datasets have been provided:
a) Copies of the Wi-Fi access logs for 24 hours in 2016 and for another 24 hours in 2018. These show every time a person connects a device to the university’s Wi-Fi network. This includes laptops, smartphones, tablets, Raspberry Pi boards and any other device that has a Wi-Fi radio. For privacy reasons the data has been anonymised by changing the user IDs, but it can be assumed that all data (from the same year) with the same user ID was collected from the same person.
b) Pedestrian traffic counts
c) Vehicle traffic counts
d) Locations of university buildings
e) Locations of university amenities
f) Opal data for all modes around The University of Sydney (2016 and 2018)
You may also use any other data available to you, including any data you find on the internet. You must include a section in your report listing the sources of data.

This problem involves several tasks. Note, there is no ‘right’ or ‘wrong’ answer here, and it is likely you will come up with different solutions to other students. Your performance will be judged based on your correct choice and application of GIS techniques, the logic of your solution and ultimately the extent to which you can make a persuasive argument for your recommendations.
Task 1: Assessment of current situation (2018)
➢ Using appropriate GIS procedures, identify people's movement patterns (where they go, how many buildings each person goes to, etc.) and how these vary by time of day.
➢ Create one map using R – it is up to you what you show on this map.
➢ Conduct any other analysis necessary to understand the current situation for your specific focus (i.e. retail and food, study facilities, etc.)
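Before any mapping, the movement measures named in Task 1 can be tabulated from the Wi-Fi logs. The sketch below is a minimal illustration in Python under stated assumptions: the brief does not give the log schema, so the (user_id, building, timestamp) layout, the sample rows and the function names are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: (anonymised user id, building, ISO timestamp).
records = [
    ("u1", "Library", "2018-08-01T09:15:00"),
    ("u1", "Engineering", "2018-08-01T11:40:00"),
    ("u2", "Library", "2018-08-01T09:50:00"),
    ("u1", "Library", "2018-08-01T14:05:00"),
]


def buildings_per_user(records):
    """Count how many distinct buildings each user connected from."""
    visited = defaultdict(set)
    for user_id, building, _ in records:
        visited[user_id].add(building)
    return {user: len(buildings) for user, buildings in visited.items()}


def users_per_building_by_hour(records):
    """Count distinct users seen in each building during each hour of the day."""
    seen = defaultdict(set)
    for user_id, building, timestamp in records:
        hour = datetime.fromisoformat(timestamp).hour
        seen[(building, hour)].add(user_id)
    return {key: len(users) for key, users in seen.items()}


if __name__ == "__main__":
    print(buildings_per_user(records))
    print(users_per_building_by_hour(records))
```

Counts of this kind can then be joined to the building locations dataset so that each building carries an attribute to symbolise on a map.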
Task 2: Assessment of change in movements (relative to 2016)
➢ Take the analysis conducted in task 1 and analyse the change from 2016
➢ Identify the major differences between 2016 and 2018
➢ Determine how the changes in the university differ from changes in the area around the university
Task 3: Final Recommendations
➢ Based on Tasks 1 and 2, identify one option for further improvements to the university’s facilities or services.
➢ Analyse the impact of your proposed improvement using the data provided
➢ Based on your analysis of your identified option and tasks 1 and 2, provide some final recommendations to the university. Your recommendations must be based on the results of your analysis.

Your report must be a maximum of six pages: one title page containing a title, course code, date and student number, plus up to five pages that include any maps, tables, references or appendices. If you can cover everything you need to in fewer pages you may do so – the page limit is a maximum, not a target.
It must contain at least the following but how you organise the report is up to you:
• The results of tasks 1, 2 and 3
• One map created using R
• At least two other maps that can be done in R, ArcGIS or QGIS. You may include additional maps if you wish.
• A section listing all the datasets you used and, if they were not part of the provided data, where you got them from. In the same section you should also explain any assumptions you have made about the data. A bulleted list is sufficient for this section.

There are three steps in my assignment using the proposed algorithm:

  1. Data Format Classification
  2. Energy Efficiency
  3. Load Balancing
Step 01:  (Data Classification)
  1. The broker divides the incoming data so that only the data that supports our assignment is used. The data falls into three broad domains: Computation Intensive (mathematical problems, etc.), Data Intensive (weather, DNA sequencing, digital maps, etc.) and Security based (military data, HTTPS, keying mechanisms). Please use appropriate workloads and also forward the details.
  2. The incoming data must also be classified by format, such as a) video/HD, b) audio, c) various text formats, d) images/maps, and so on (at least 5 data formats).
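As a rough illustration of the format classification in Step 01, the sketch below sorts incoming items into format classes by file extension. It is plain Python rather than CloudSim code, and the extension table, the class names and the function names are hypothetical choices that would need to match the real workload.

```python
from pathlib import PurePath

# Hypothetical extension-to-format table; extend it to match the real workload.
FORMAT_BY_EXTENSION = {
    ".mp4": "video", ".mkv": "video",
    ".mp3": "audio", ".wav": "audio",
    ".txt": "text", ".csv": "text", ".json": "text",
    ".png": "image", ".jpg": "image", ".tif": "image",
}


def classify_format(filename):
    """Return the format class for a file name, or 'other' if unknown."""
    suffix = PurePath(filename).suffix.lower()
    return FORMAT_BY_EXTENSION.get(suffix, "other")


def group_by_format(filenames):
    """Group file names by their format class, preserving input order."""
    groups = {}
    for name in filenames:
        groups.setdefault(classify_format(name), []).append(name)
    return groups


if __name__ == "__main__":
    print(group_by_format(["clip.MP4", "talk.mp3", "notes.txt", "map.tif", "blob.bin"]))
```

Items that fall into "other" show where the table needs another entry before the fifth format class is fixed.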
Step 02:  (Energy Efficiency)
After Step 01 is performed, Step 02 needs to be implemented. The proposed algorithm must provide energy efficiency in addition to load balancing. Once it is implemented, the factors listed in the tables below must be calculated for it; these will be used to generate the results.
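One common way to estimate the energy figure is a linear power model, in which a host draws a fixed idle power plus a share proportional to its CPU utilization. The sketch below is plain Python, not CloudSim code. The idle and maximum wattages are hypothetical placeholders, and the result is reported as energy in kWh, which is an assumption about how the "In KW" column is meant to be filled.

```python
def host_power_watts(utilization, idle_watts=100.0, max_watts=250.0):
    """Linear power model: idle draw plus a share proportional to utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return idle_watts + (max_watts - idle_watts) * utilization


def energy_kwh(utilization_samples, interval_seconds, **power_kwargs):
    """Sum power over equally spaced utilization samples and convert to kWh."""
    joules = sum(
        host_power_watts(u, **power_kwargs) * interval_seconds
        for u in utilization_samples
    )
    return joules / 3.6e6  # 1 kWh = 3.6 million joules


if __name__ == "__main__":
    # Twelve 5-minute samples, i.e. one hour of monitoring.
    print(energy_kwh([0.2, 0.4, 0.6, 0.8] * 3, interval_seconds=300))
```

Applying the same model to both algorithms keeps the energy comparison between them consistent.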
Step 03:  (Load Balancing)
The proposed algorithm is selected on the basis of load balancing, so load balancing will be achieved once the implementation in CloudSim is complete.
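To make the load balancing goal concrete, the sketch below shows a simple greedy baseline that always gives the next task to the virtual machine with the smallest accumulated load. It is plain Python rather than CloudSim code, it is not the proposed algorithm, and the function name and the use of task length as the load measure are assumptions for illustration.

```python
import heapq


def assign_least_loaded(task_lengths, vm_count):
    """Greedy balancing: give each task to the VM with the least total load."""
    heap = [(0.0, vm) for vm in range(vm_count)]  # (current load, VM index)
    heapq.heapify(heap)
    assignment = {vm: [] for vm in range(vm_count)}
    # Placing the longest tasks first usually evens out the final loads.
    ordered = sorted(enumerate(task_lengths), key=lambda item: -item[1])
    for task_id, length in ordered:
        load, vm = heapq.heappop(heap)
        assignment[vm].append(task_id)
        heapq.heappush(heap, (load + length, vm))
    loads = {
        vm: sum(task_lengths[t] for t in tasks)
        for vm, tasks in assignment.items()
    }
    return assignment, loads


if __name__ == "__main__":
    print(assign_least_loaded([8, 7, 6, 5, 4], vm_count=2))
```

A baseline like this gives a reference point for the comparison tables, since any proposed algorithm should at least match its load spread.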
Note: One of the papers already emailed is attached for comparison (the latest one). Please compare the results with that paper, devise a suitable algorithm that performs well compared with the one in the attached paper, and provide the implementation details.
Overall Requirements  
  1. I need separate and combined descriptive graphs, as well as tables of results, for all three domains mentioned in Step 01 part 1, in seconds (where time is applicable).
  2. I also need separate and combined descriptive graphs, as well as tables of results, for all data formats mentioned in Step 01 part 2.
  3. Various diagrams are also requested.




When Computation Intensive Data is applied: 
Metrics / Techniques | Throughput Time | Overhead (in terms of communication) | Fault Tolerance (Yes/No) | Migration Time | Response Time | Resource Utilization (in %) | Scalability (in number) | Performance Efficiency | Energy Efficiency (in kW) | No. of Migrations | No. of SLA Violations
Algorithm 01 | | | | | | | | | | |
Algorithm 02 | | | | | | | | | | |
When Data Intensive data is applied:
Metrics / Techniques | Throughput Time | Overhead (in number) | Fault Tolerance (Yes/No) | Migration Time | Response Time | Resource Utilization (in number) | Scalability (in number) | Performance Efficiency | Energy Efficiency (in kW) | No. of Migrations | No. of SLA Violations
Algorithm 01 | | | | | | | | | | |
Algorithm 02 | | | | | | | | | | |
When Security based information is applied:
Metrics / Techniques | Throughput Time | Overhead (in number) | Fault Tolerance (Yes/No) | Migration Time | Response Time | Resource Utilization (in number) | Scalability (in number) | Performance Efficiency | Energy Efficiency (in kW) | No. of Migrations | No. of SLA Violations
Algorithm 01 | | | | | | | | | | |
Algorithm 02 | | | | | | | | | | |
When Video/HDs based information is applied:
Metrics / Techniques | Throughput Time | Overhead (in number) | Fault Tolerance (Yes/No) | Migration Time | Response Time | Resource Utilization (in number) | Scalability (in number) | Performance Efficiency | Energy Efficiency (in kW) | No. of Migrations | No. of SLA Violations
Algorithm 01 | | | | | | | | | | |
Algorithm 02 | | | | | | | | | | |
When Audio based information is applied:
Metrics / Techniques | Throughput Time | Overhead (in number) | Fault Tolerance (Yes/No) | Migration Time | Response Time | Resource Utilization (in number) | Scalability (in number) | Performance Efficiency | Energy Efficiency (in kW) | No. of Migrations | No. of SLA Violations
Algorithm 01 | | | | | | | | | | |
Algorithm 02 | | | | | | | | | | |
When Text based information is applied:
Metrics / Techniques | Throughput Time | Overhead (in number) | Fault Tolerance (Yes/No) | Migration Time | Response Time | Resource Utilization (in number) | Scalability (in number) | Performance Efficiency | Energy Efficiency (in kW) | No. of Migrations | No. of SLA Violations
Algorithm 01 | | | | | | | | | | |
Algorithm 02 | | | | | | | | | | |
When Images based information is applied:
Metrics / Techniques | Throughput Time | Overhead (in number) | Fault Tolerance (Yes/No) | Migration Time | Response Time | Resource Utilization (in number) | Scalability (in number) | Performance Efficiency | Energy Efficiency (in kW) | No. of Migrations | No. of SLA Violations
Algorithm 01 | | | | | | | | | | |
Algorithm 02 | | | | | | | | | | |


When based information is applied:
Metrics / Techniques | Throughput Time | Overhead (in number) | Fault Tolerance (Yes/No) | Migration Time | Response Time | Resource Utilization (in number) | Scalability (in number) | Performance Efficiency | Energy Efficiency (in kW) | No. of Migrations | No. of SLA Violations
Algorithm 01 | | | | | | | | | | |
Algorithm 02 | | | | | | | | | | |




Please find attached the base papers (especially EE13b01.pdf) and find the problem statement below:

In the cloud we have massive data volumes located at geographically distributed sites, computation is done over the network, and tasks are transacted through read/write operations, so dynamic energy consumption and load imbalance are not unanticipated but common. Making the cloud load balanced and energy efficient is a challenging task. We will propose a technique that is capable of achieving energy efficiency and load balancing simultaneously, in the following manner:

1. The broker divides the incoming data so that only the data that supports our assignment is used. The data falls into three broad domains:
A. Computation Intensive (mathematical problems, etc.),
B. Data Intensive (weather, DNA sequencing, digital maps, etc.),
C. Security based (military data, HTTPS, keying mechanisms). Please use appropriate workloads and also forward the details.

2. The incoming data must also be classified by format, such as a) video/HD, b) audio, c) various text formats, d) images/maps, and so on (at least 5 data formats).

Then, in the second part, perform energy efficiency, and in the third part, perform load balancing.

"A possible suggestion here is to incorporate an optimal resource allocation algorithm with the aim of ensuring proper utilization of cloud resources. This can improve the overall efficiency of the system. Implementation must be done in CloudSim 3.0.3."