
Proceedings of

International Conference on IoT, Next Generation Networks & Cloud Computing 2019 (ICINC-2019)

Organized by the Department of Computer Engineering,
Sinhgad Technical Education Society's Smt. Kashibai Navale College of Engineering, Vadgaon (Bk.), Pune 411041,
in association with Savitribai Phule Pune University

CONFERENCE COMMITTEE

CHIEF PATRON Prof. M. N. Navale, Founder President, Sinhgad Institutes

PATRON Dr. (Mrs.) S. M. Navale, Founder Secretary, Sinhgad Institutes

PATRON Mr. R. M. Navale, Vice-President (HR), Sinhgad Institutes

PATRON Mrs. Rachana Navale-Ashtekar, Vice-President (Admin), Sinhgad Institutes

CONVENOR Dr. P. N. Mahalle, Professor & Head; Member, BoS Computer Engineering, SPPU; Ex-Chairman, BoS Information Technology, SPPU, Pune

ORGANIZING SECRETARIES Dr. G. R. Shinde, Prof. J. N. Nandimath

CORE TECHNICAL COMMITTEE Prof. S. K. Pathan, Prof. S. P. Pingat, Prof. R. A. Satao, Prof. V. S. Deshmukh, Prof. V. V. Kimbahune, Prof. A. A. Deshmukh, Prof. V. R. Ghule, Prof. P. S. Desai, Prof. P. N. Railkar, Prof. P. S. Raskar, Prof. S. R. Pavshere, Prof. P. A. Sonewar, Prof. P. R. Chandre, Prof. A. B. Kalamkar, Prof. S. A. Kahate, Prof. B. D. Thorat, Prof. P. S. Teli, Prof. P. P. Patil, Prof. D. T. Bodake, Prof. G. S. Pise, Prof. S. P. Patil, Prof. M. Tamboli

CORE SUPPORTING HANDS Ms. Manisha Shinde, Mr. Sanjay Panchal, Mr. Pranesh Holgundikar, Mr. Salim Shaikh, Ms. Komal Ingole, Ms. Deepali Ingole

Message from the Principal's Desk

Dr. A. V. Deshpande, Principal, Smt. Kashibai Navale College of Engineering, Pune.

With the advent of high-speed communication, tremendous impetus has been felt across the core technology sectors related to computer networking. These include next-generation networks; advanced database technologies such as data mining and information retrieval; and image and signal processing. There have also been tremendous advances in solution systems and soft computing, such as cloud computing, grid computing, neural networks, and network and cyber security. The Internet, the web, and other service sectors have gone through a sea change in the last decade. A need was therefore felt to organize this International Conference on "Internet of Things, Next Generation Networks and Cloud Computing 2019" (ICINC 2019) to acquaint the researchers, faculty, and students of this college with the latest trends and developments in this direction. This conference indeed provides a very useful platform for close congregation between industry and academia. The conference addresses the trends, challenges, and future roadmaps within a conglomerate of existing and novel wireless technologies and recent advances in information theory and its applications. To make the event more meaningful, we interacted with premier institutes, organizations, and leading industries across the country in the field of computer networking and requested them to demonstrate and share the latest technology with participants. I am sure this close interaction will enrich us all with knowledge of the latest developments.

Message from the Vice Principal

Dr. K. R. Borole, Vice Principal, Smt. Kashibai Navale College of Engineering, Pune.

Warm and happy greetings to all. I am immensely happy that the Department of Computer Engineering of Smt. Kashibai Navale College of Engineering, Vadgaon (Bk.), Pune is organizing the International Conference on "Internet of Things, Next Generation Networks and Cloud Computing 2019" (ICINC-2019) on February 15-16, 2019. The conference addresses the trends, challenges, and future roadmaps within a conglomerate of existing and novel wireless technologies and recent advances in information theory and its applications. The conference features a comprehensive technical program, including special sessions and short courses. The dedicated Head of the Department of Computer Engineering, Dr. P. N. Mahalle (Convener), Dr. G. R. Shinde and Prof. J. N. Nandimath (Organizing Secretaries), the staff members, and the disciplined undergraduate and postgraduate students and research scholars of Smt. Kashibai Navale College of Engineering, Vadgaon (Bk.), Pune are added strengths of our college. On this occasion, I would like to express my best wishes for this event. I congratulate the Head of the Department, the staff members and students of the Computer Engineering Department, and the participants from all over India and abroad for organizing and participating in this conference. I express my sincere thanks to all the authors, invited speakers, session chairpersons, and participants, and to the reviewers who made the painstaking effort of reviewing the research papers and technical manuscripts included in these proceedings.

Message from Convener & Head of Department

Dr. Parikshit N. Mahalle, Head & Professor, Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune.

It is an honor and a privilege to host and witness an international conference, a congregation of scholarly people who meet and put forward their theories to raise technology by a notch. I feel proud to see intellectuals from different countries come together to discuss their research and acknowledge others' achievements. I would like to quote Jonathan Swift: "Vision is the art of seeing what is invisible to others." We have a vision of excelling in education, and the rankings awarded to our institute by various prestigious organizations are testimonials to this fact. Our strong foresight helps us adapt quite easily to the changing environment, compete with others, and make a mark of our own. My heartiest congratulations go to the organizing committee and the participants of ICINC19 on the successful conduct of this 4th international conference.

Message from Organizing Secretary

Dr. G. R. Shinde, Organizing Secretary, Smt. Kashibai Navale College of Engineering, Pune.

Dear friends, adding a new chapter to the tradition set by the proceedings of the third international conference at our college, I am very happy to place before you the proceedings of the 4th International Conference, ICINC 2019. As Organizing Secretary, allow me to introduce these proceedings. They consist of 96 papers spread across six domains. I laud my editorial team, which has brought out this copy with beautiful and research-rich presentations; it was indeed a herculean task, and it has been my pleasure to guide and coordinate them in bringing out these proceedings. My sincere thanks to Prof. M. N. Navale, Founder President, STE Society, Pune; Dr. (Mrs.) S. M. Navale, Secretary, STE Society, Pune; Ms. Rachana Navale-Ashtekar, Vice-President (Admin), STE Society, Pune; and Mr. Rohit M. Navale, Vice-President (HR), STE Society, Pune for their encouragement and support. I would also like to thank my Principal, Dr. A. V. Deshpande, for his unstinted help and guidance. Dr. K. R. Borole, Vice Principal, and Dr. P. N. Mahalle, Head of the Computer Department, have been kind enough to advise me in carrying out the onerous responsibility of managing the functions of Organizing Secretary. I would also like to thank Savitribai Phule Pune University for its association with us. I hope the research community will enjoy reading these proceedings during their research.

Message from Organizing Secretary

Prof. J. N. Nandimath, Organizing Secretary, Smt. Kashibai Navale College of Engineering, Pune.

Dear friends, research is an important activity of human civilization. It is crucial for improving the economy of our country and achieving sustainable development. The outcomes of research should not be confined to research laboratories; effort must be made so that humanity can benefit from new developments in research. At the same time, research education should also be given due importance, in order to attract young talented people to research and equip them with the knowledge, information, and wisdom suitable for industry. The 4th International Conference on "Internet of Things, Next Generation Networks and Cloud Computing 2019" (ICINC-2019) aims to provide a common platform for the research community, industry, and academia. It is also expected to be a wonderful gathering of senior and young professionals of the Department of Computer Engineering carrying out research. We wish to thank all the authors, reviewers, sponsors, and invited speakers, the members of the advisory board and organizing team, the student volunteers, and all others who have contributed to the successful organization of this conference. I am very grateful to Prof. M. N. Navale, Founder President, STE Society, Pune; Dr. (Mrs.) S. M. Navale, Secretary, STE Society, Pune; Ms. Rachana Navale-Ashtekar, Vice-President (Admin), STE Society, Pune; and Mr. Rohit M. Navale, Vice-President (HR), STE Society, Pune for their encouragement and support. I would also like to thank Principal Dr. A. V. Deshpande for his generous help and guidance. Dr. K. R. Borole, Vice Principal, and Dr. P. N. Mahalle, Head of the Computer Department, have been kind enough to advise me in carrying out the arduous responsibility of managing the functions of Organizing Secretary. I would also like to thank Savitribai Phule Pune University for its association and for providing the necessary funding.

Index
Sr No. Title (Authors) ... Page No.

Internet of Things
1. Automated Toll Collection System And Theft Detection Using RFID (Samruddhi S. Patil, Priti Y. Holkar, Kiran A. Pote, Shubhashri K. Chavan, Asmita Kalamkar) ... 1
2. Wi-Fi Based Home Surveillance Bot Using Pi Camera & Accessing Live Streaming Using YouTube To Improve Home Security (Ritik Jain, Varshun Tiku, Rinisha Bhaykar, Rishi Ahuja, Prof. S. P. Pingat) ... 7
3. Smart Dustbin With Metal Detector (Dhiraj Jain, Vaidehi Kale, Raksha Sisodiya, Sujata Mahajan, Dr. Gitanjali R. Shinde) ... 12
4. Improvement In Personal Assistant (Ashik Raj, Sreeja Singh, Deepak Kumar, Deshpande Shivani Shripad) ... 17
5. IoT Based Home Automation System For Senior Citizens (Ashwathi Sreekumar, Divyanshi Shah, Himanshi Varshney) ... 20
6. Smart Traffic Control System Using Time Management (Gaikwad Kavita Pitambar, More Sunita Vitthal, Nalge Bhagyashree Muktaji) ... 25
7. The Pothole Detection: Using A Mobile Sensor Network For Road Surface Monitoring (Sanket Deotarse, Nate Pratiksha, Shaikh Kash, Sonnis Poonam) ... 29
8. IoT Based Agricultural Soil Prediction For Crops With Precautions (Prof. Yashanjali Sisodia, Pooja Gahile, Chaitali Meher) ... 33
9. IoMT Healthcare: Security Measures (Ms. Swati Subhash Nikam, Ms. Ranjita Balu Pandhare) ... 36
10. Smart Wearable Gadget For Industrial Safety (Ketki Apte, Rani Khandagle, Rijwana Shaikh, Rani Ohal) ... 42
11. Smart Solar Remote Monitoring and Forecasting System (Niranjan Kale, Akshay Bondarde, Nitin Kale, Shailesh Kore, Prof. D. H. Kulkarni) ... 45
12. Smart Agriculture Using Internet of Things (Akshay Kudale, Yogesh Bhavsar, Ashutosh Auti, Mahesh Raykar, Prof. V. R. Ghule) ... 50
13. Area-Wise Bike Pooling "BikeUp" (Mayur Chavhan, Sagar Tambe, Amol Kharat, Prof. S. P. Kosbatwar) ... 54
14. Smart Water Quality Management System (Prof. Rachana Satao, Rutuja Padavkar, Rachana Gade, Snehal Aher, Vaibhavi Dangat) ... 58
15. Intelligent Water Regulation Using IoT (Shahapurkar Shreya Somnath, Kardile Prajakta Sudam, Shipalkar Gayatri Satish, Satav Varsha Subhash) ... 62
16. Smart Notice Board (Shaikh Tahura Anjum Vazir, Shaikh Fiza Shaukat, Kale Akshay Ashok) ... 65
17. Vehicle Identification Using IoT (Miss Yashanjali Sisodia, Mr. Sudarshan R. Diwate) ... 68
18. Wireless Communication System Within Campus (Mrs. Shilpa S. Jahagirdar, Mrs. Kanchan A. Pujari) ... 72
19. License Plate Recognition Using RFID (Vaibhavi Bhosale, Monali Deoghare, Dynanda Kulkarni, Prof. S. A. Kahate) ... 77

Data Analytics and Machine Learning
20. Online Recommendation System (Prof. Swapnil N. Patil, Ms. Vaishnavi Jadhav, Ms. Kiran Patil, Ms. Shailja Maheshwari) ... 81
21. Intelligent Query System Using Natural Language Processing (Kshitij Ingole, Akash Patil, Kalyani Kshirsagar, Pratiksha Bothara, Prof. Vaishali S. Deshmukh) ... 86
22. Mood Enhancer Chatbot Using Artificial Intelligence (Divya Khairnar, Ritesh Patil, Shubham Bhavsar, Shrikant Tale) ... 92
23. Multistage Classification of Diabetic Retinopathy using Convolutional Neural Networks (Aarti Kulkarni, Shivani Sawant, Simran Rathi, Prajakta Puranik) ... 96
24. Predicting Delays And Cancellation Of Commercial Flights Using Meteorological And Historic Flight Data (Kunal Zodape, Shravan Ramdurg, Niraj Punde, Gautam Devda, Prof. Pankaj Chandre, Dr. Purnima Lala Mehta) ... 101
25. A Survey on Risk Assessment in Heart Attack Using Machine Learning (Rahul Satpute, Irfan Hussain, Prof. Piyush Sonewar) ... 109
26. Textual Content Moderation using Supervised Machine Learning Approach (Revati Ganorkar, Shubham Deshpande, Mayur Giri, Gaurang Suki, Araddhana Deshmukh) ... 115
27. Survey Paper on Location Recommendation Using Scalable Content-Aware Collaborative Filtering and Social Networking Sites (Prof. Pramod P. Patil, Ajinkya Awati, Deepak Patil, Rohan Shingate, Akshay More) ... 122
28. Anonymous Schedule Generation Using Genetic Algorithm (Adep Vaishnavi Anil, Berad Rituja Shivaji, Myana Vaishnavi Dnyaneshwar, Pawar Ashwini Janardhan) ... 127
29. A Survey on Unsupervised Feature Learning Using a Novel Non-Symmetric Deep Autoencoder (NDAE) For NIDPS Framework (Vinav Autkar, Prof. P. R. Chandre, Dr. Purnima Lala Mehta) ... 131
30. Turing Machine Imitate Artificial Intelligence (Tulashiram B. Pisal, Prof. Dr. Arjun P. Ghatule) ... 138
31. A Survey on Emotion Recognition between POMS and Gaussian Naïve Bayes Algorithm Using Twitter API (Darshan Vallur, Prathamesh Kulkarni, Suraj Kenjale) ... 145
32. Anti-Depression Chatbot In Java (Manas Mamidwar, Ameya Marathe, Ishan Mehendale, Abdullah Pothiyawala, Prof. A. A. Deshmukh) ... 150
33. Emotion Analysis on Social Media Platform using Machine Learning (Shreyas Bakshetti, Pratik Gugale, Sohail Shaikh, Jayesh Birari) ... 158
34. Stock Market Prediction Using Machine Learning Techniques (Rushikesh M. Khamkar, Rushikesh P. Kadam, Moushmi R. Jain) ... 164
35. Stock Recommendations And Price Prediction By Exploiting Business Commodity Information Using Data Mining And Machine Learning Techniques (Dr. Parikshit N. Mahalle, Prof. P. R. Chandre, Mohit Bhalgat, Aukush Mahajan, Priyamvada Barve, Vaidehi Jagtap) ... 172
36. A Machine Learning Model For Toxic Comment Classification (Mihir Pargaonkar, Rohan Nikumbh, Shubham Shinde, Akshay Wagh, Prof. D. T. Bodake) ... 178
37. Holographic Artificial Intelligence Assistance (Patil Girish, Pathade Omkar, Dubey Shweta, Simran Munot) ... 186
38. Personal Digital Assistant To Enhance Communication Skills (Prof. G. Y. Gunjal, Hritik Sharma, Rushikesh Vidhate, Rohit Gaikwad, Akash Kadam) ... 191
39. Fake News Detection Using Machine Learning (Kartik Sharma, Mrudul Agrawal, Malav Warke, Saurabh Saxena) ... 194
40. Cost-Effective Big Data Science in Medical and Health Care Applications (Dr. S. T. Patil, Prof. G. S. Pise) ... 199
41. AI-Assisted Chatbots For E-Commerce To Address Selection Of Products From Multiple Categories (Gauri Shankar Jawalkar, Rachana Rajesh Ambawale, Supriya Vijay Bankar, Manasi Arun Kadam, Dr. Shafi K. Pathan, Jyoti Prakash Rajpoot) ... 206
42. Distributed Storage, Analysis, And Exploration Of Multidimensional Phenomena With Trident Framework (Nikesh Mhaske, Dr. Prashant Dhotre) ... 216

Data Mining and Image Retrieval
43. Utilising Location Based Social Media For Target Marketing In Tourism: Bringing The Twitter Data Into Play (Prof. G. S. Pise, Sujit Bidawe, Kshitij Naik, Palash Bhanarkar, Rushikesh Sawant) ... 222
44. Cross Media Retrieval Using Mixed-Generative Hashing Methods (Saurav Kumar, Shubham Jamkhola, Mohd Uvais, Paresh Khade, Mrs. Manjusha Joshi) ... 227
45. An Efficient Algorithm For Mining Top-K High Utility Itemset (Ahishek Doke, Akshay Bhosale, Sanket Gaikwad, Shubham Gundawar) ... 232
46. Sarcasm Detection Using Text Factorization On Reviews (Tejaswini Murudkar, Vijaya Dabade, Priyanka Lodhe, Mayuri Patil, Shailesh Patil) ... 239
47. Prediction On Health Care Based On Near Search By Keyword (Mantasha Shaikh, Sourabh Gaikwad, Pooja Garje, Harshada Diwate) ... 242
48. Crime Detection And Prediction System (Aparna Vijay Bhange, Shreya Arish Bhuptani, Manjushri Patilingale, Yash Kothari, Prof. D. T. Bodake) ... 249
49. Academic Assessment With Automated Question Generation And Evaluation (Kishore Das, Ashish Kempwad, Shraddha Dhumal, Deepti Rana, Prof. S. P. Kosbatwar) ... 254
50. A Comprehensive Survey For Sentiment Analysis Techniques (Amrut Sabale, Abhishek Charan, Tushar Thorat, Pavan Deshmukh) ... 258
51. E-Referencing Of Digital Document Using Text Summarization (Harsh Purbiya, Venktesh Chandrikapure, Harshada Sandesh Karne, Ishwari Shailendra Datar, Prof. P. S. Teli) ... 263
52. Online Shopping System With Stitching Facility (Akshada Akolkar, Dahifale Manjusha, Chitale Sanchita) ... 268
53. A Survey On Online Medical Support System (Shivani J. Sawarkar, G. R. Shinde) ... 272
54. Natural Language Question Answering System Using RDF Framework (Maruti K. Bandgar, Avinash H. Jadhav, Ashwini D. Thombare, Poornima D. Asundkar, Prof. P. P. Patil) ... 280
55. Technique For Mood Based Classification Of Music By Using C4.5 Classifier (Manisha Rakate, Nandan More) ... 284
56. Super Market Assistant With Market Basket And Inventory Analytics (Aditya Kiran Potdar, Atharv Subhash Chitre, Manisha Dhalaram Jongra, Prasad Vijay Kudale, Prema S. Desai) ... 290
57. Analysis And Prediction Of Environment Near A Public Place (Bhagyesh Pandey, Rahul Bhati, Ajay Kuchanur, Darshan Jain, S. P. Kosbatwar) ... 295
58. Secure Cloud Log For Cyber Forensics (Dr. V. V. Kimbahune, Punam Shivaji Chavan, Priyanka Uttam Linge, Pawan Bhutani) ... 300
59. Traffic Flow Prediction With Big Data (Nitika Vernekar, Shivani Naik, Ankita More, Dr. V. V. Kimbahune, Pawan Bhutani) ... 304
60. Determining Diseases Using Advance Decision Tree In Data Mining Technology (Vrushali Punde, Priyanka Pandit, Sharwari Nemane) ... 309
61. Survey Paper on Multimedia Retrieval Using Semantic Cross Media Hashing Method (Prof. B. D. Thorat, Akash Parulekar, Mandar Bedage, Ankit Patil, Dipali Gome) ... 314
62. Modern Logistics Vehicle System Using Tracking And Security (Arpit Sharma, Bakul Rangari, Rohit Walvekar, Bhagyashree Nivangune, Prof. G. Gunjal) ... 318

Network and Cyber Security
63. Online Voting System Using OTP (Archit Bidkar, Madhabi Ghosh, Prajakta Madane, Rohan Mahapatra, Prof. Jyoti Nandimath) ... 324
64. Accident Detection And Prevention Using Smartphone (Sakshi Kottawar, Mayuri Sarode, Ajit Andhale, Ashay Pajgade, Shailesh Patil) ... 330
65. Generation of Multi-Color QR Code Using Visual Secret Sharing Scheme (Nirup Kumar Satpathy, Sandhya Barikrao Ingole, Pari Sabharwal, Harmanjeet Kour) ... 335
66. Verifying The Integrity Of Digital Files Using Decentralized Timestamping On The Blockchain (Akash Dhande, Anuj Jain, Tejas Jain, Tushar Mhaslekar, Prof. P. N. Railkar, Jigyasa Chadha) ... 340
67. Smart Phone Sensor App Using Security Questions (Prof. Yashanjali Sisodia, Miss Monali Sable, Miss Rutuja Pawar) ... 345
68. A Survey on Privacy Awareness Protocol for Machine to Machine Communication in IoT (Apurva R. Wattamwar, Dr. P. N. Mahalle, D. D. Shinde) ... 353
69. Survey on Security Enhancement In Network Protocol (Jagdish S. Ingale, Pathan Mohd Shafi, Jyoti Prakash Rajpoot) ... 359
70. Distributed Access Control Scheme for Machine to Machine Communication in IoT Using Trust Factor (Miss Nikita D. Mazire, Dr. Vinod V. Kimbahune, D. D. Shinde) ... 365
71. Multimodal Game Bot Detection Using User Behavioral Characteristics (Prof. P. R. Chandre, Kushal Matha, Kiran Bibave, Roshani Patil, Mahesh Mali) ... 371
72. Mediblock: A Healthcare Management System Using Blockchain Technology (Gayatri Bodke, Himanshu Bagale, Prathamesh Bhaskarwar, Mihir Limaye, Dr. S. K. Pathan, Jyoti Prakash Rajpoot) ... 375
73. Survey On Multifactor Authentication System (Nisha Kshatriya, Aishwarya Bansude, Nilesh Bansod, Anil Sakate) ... 379

Cloud Computing
74. Cloud Stress Distribution And De-Duplication Check Of Cloud Data With Secure Data Sharing Via Cloud Computing (Amruta Deshmukh, Rajeshri Besekar, Raveena Gone, Roshan Wakode, Prof. D. S. Lavhkare) ... 384
75. Efficient Client-Side Deduplication Of Encrypted Data With Improved Data Availability And Public Auditing In Cloud Storage (Akash Reddy, Karishma Sarode, Pruthviraj Kanade, Sneha M. Patil) ... 389
76. A Novel Methodology Used To Store Big Data Securely In Cloud (Kale Piyusha Balasaheb, Ukande Monika Prakash) ... 397
77. Survey Paper on Secure Heterogeneous Data Storage Management with Deduplication in Cloud Computing (Miss Arati Gaikwad, Prof. S. P. Patil) ... 402
78. Survey on A Ranked Multi-Keyword Search in Cloud Computing (Mr. Swaranjeet Singh, Prof. D. H. Kulkarni) ... 411
79. Private Secure Scalable Cloud Computing (Himanshu Jaiswal, Sankalp Kumar, Janhvi Charthankar, Sushma Ahuja) ... 417

Image & Signal Processing
80. Indoor Navigation Using Augmented Reality (Prof. B. D. Thorat, Sudhanshu S. Bhamburkar, Sumit R. Bhiungade, Harshada S. Kothawade, Neha A. Jamdade) ... 423
81. AI Based Lesion Detection System (Mayuri Warke, Richa Padmawar, Sakshi Nikam, Veena Mahesh, Prof. Gitanjali R. Shinde, D. D. Shinde) ... 430
82. Leap Virtual Board: Switchless Home Appliances Using Leap Motion (Aakanksha Kulkarni, Sakshi Chauhan, Vaishnavi Sawant, Shreya Satpute, Prof. P. N. Railkar, Jigyasa Chadha) ... 436
83. Recognition Of Fake Indian Currency Notes Using Image Forgery Detection (Kishory Chavan, Rutuja Padwad, Vishal Pandita, Harsh Punjabi, Prof. P. S. Raskar, Jigyasa Chadha) ... 442
84. Detection Of Suspicious Person And Alerting In The Security System (Avani Phase, Purva Puranik, Priyal Patil, Rigved Patil, Dr. Parikshit Mahalle, D. D. Shinde) ... 448
85. Adaptive Computer Display For Preventing Computer Vision Syndrome (Manpreet Kaur, Dhanashri Yadav, Ruhi Sharma, Aman Katiyar, Bhakti Patil) ... 456
86. AAS [Automated Attendance System] Using Face Discernment And Recognition Using Faster R-CNN, Pose Correction & Deep Learning (Mohit Vakare, Amogh Agnihotri, Adwait Sohoni, Sayali Dalvi, Prof. Araddhana Arvind Deshmukh) ... 460
87. A Survey Of Current Digital Approaches To Improve Soil Fertility (Rahul Nikumbhe, Jaya Bachchhav, Ganesh Kulkarni, Amruta Chaudar) ... 465
88. IoT Based Polyhouse Monitoring And Controlling System (Shelke Snehal, Aware Yogita, Sapkal Komal, Warkad Shweta) ... 470
89. Adoption Of E-Learning In Engineering Colleges For Training The Students (Santosh Borde, Yogesh Kumar Sharma) ... 474
90. ICT Gadget: Design Of E-Learning System For Rural Community (Ansari M A, Yogesh Kumar Sharma) ... 482
91. Disease Infected Crop Identification Using Deep Learning and Suggestion of Solution (J. N. Nandimath, Sammit Ranade, Shantanu Pawar, Mrunmai Patil) ... 488
92. Crop Recommendation Based On Local Environmental Parameters Using Machine Learning Approach (Saurabh Jadhav, Kaustubh Borse, Sudarshan Dhatrak, Milind Chaudhari) ... 496
93. A Survey On Key Distribution And Trust Based Scheme On Big Data Analysis For Group User On Cloud Service (Mrunal S. Jagtap, Prof. A. M. Wade) ... 500
94. Survey On Mining Online Social Data For Detecting Social Network Mental Disorders (Miss Aishwarya Uttam Deore, Prof. Aradhana A. Deshmukh) ... 507
95. Survey On Secure Cloud Log For Cyber Forensics (Arati S. Patil, Prof. Rachana A. Satao) ... 513
96. Analysis And Evaluation Of Privacy Policies Of Online Services Using Machine Learning (Ashutosh Singh, Manish Kumar, Rahul Kumar, Dr. Prashant S. Dhotre) ... 519
97. Web Image Search Re-ranking Dependent on Diversity (Nagesh K Patil, S B Nimbekar) ... 525

INTERNET OF THINGS

Proceedings of the International Conference on Internet of Things, Next Generation Network & Cloud Computing 2019

AUTOMATED TOLL COLLECTION SYSTEM AND THEFT DETECTION USING RFID

Samruddhi S. Patil, Priti Y. Holkar, Kiran A. Pote, Shubhashri K. Chavan, Asmita Kalamkar

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, India.
[email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
In a country like India, manual toll collection is quite time-consuming due to the overwhelming population. Vehicles lining up in long queues for toll collection is a cumbersome process. Taking all these problems into consideration, we have come up with an automated toll collection and theft detection system using RFID. In this system, the toll is automatically deducted from the customer's account, and he/she is notified about it through a text message. In case of an accident or theft, if the car passes through a toll plaza, it can be blocked there itself. Moreover, a display at the plaza shows the deducted toll amount to the person assigned to monitor the toll functioning.

Keywords
Automated Toll Collection, Radio Frequency Identification (RFID), Global System for Mobile (GSM), Arduino ATMega328, Theft Detection

1. INTRODUCTION
The national highway network in India is a network of trunk roads of over 1,15,435 km. The Government of India plans various policies for national highways. The Government of India, through the National Highways Authority of India (NHAI), works in a public-private partnership model for highway development, and the government collects toll tax for maintenance and construction. In India there are about 468 toll plazas. While national highways constitute 2.7% of Indian roads, they carry 40% of the traffic. With such heavy traffic flowing on national highways, toll collection needs to be made as fast as possible to avoid long queues of vehicles.

As the population of India increases day by day, the number of private as well as public vehicles is also increasing. This increase in the number of vehicles is also a reason for the increase in traffic and the various crimes associated with it [2]. Cases of theft, hit and run, robbery, kidnapping, and smuggling are reported in increasing numbers. Though the number of crimes is increasing, many of them remain unsolved because the vehicles involved could not be recognized accurately, as recognizing them manually is very difficult and cumbersome.

[1] Also, in today's toll system, the toll at most toll plazas is collected manually, which has become a tedious job, as vehicles are made to line up in long queues and more time is needed for toll collection; it also requires huge manpower to carry out redundant work. Another approach adopted recently is FASTags, which work like RFID tags. However, they are being used on a small scale with limited objectives, and system efficiency is not taken care of. So, we are emphasizing closing these loopholes and adding more objectives along with toll collection by implementing an automated toll collection system that takes into consideration wider objectives such as accident scenarios and theft detection. Radio Frequency Identification technology is now booming and is used in many sectors on a large scale. Mainly, RFID is used for

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 1


tracing vehicles, government sectors, Aerospace and Healthcare. The system proposed in this paper is based on an automated toll collection using RFID [3]. An RFID tag is attached to each vehicle for the unique identification of the vehicle and RFID readers are placed at all the toll plazas. When a vehicle comes within the range of an RFID reader placed on the toll plaza, the reader reads the RFID tag through Radio frequency and sends the information to the system through Arduino ATMega328. The details pertaining the owner is retrieved from the database by matching the vehicle number provided and thus, the required owner‘s details are displayed on the desktop provided on the toll plaza. Automatic deduction of toll is carried out from the user‘s prepaid account and the user is notified about the same through a text message through the GSM incorporated with the Arduino ATMega328 micro-controller. Once the toll is deducted, the barricades are opened up and the vehicle can safely pass through it. In another scenario, if a vehicle is stolen, the owner can file a complaint against the stolen vehicle and a FIR No will be assigned to it by the police. The FIR No. will then be used to blacklist the vehicle in the central database through user application. When the stolen vehicle or any blacklisted vehicle shall pass through any of the toll gates, the barricades will block the vehicle right there. Similarly, in a hit and run case, if anyone notes the vehicle no. the information will be sent to all the toll plazas and the vehicle no. will be blacklisted. Thus, vehicle can be blocked when it happens to pass through any of the toll plazas. 2. LITERATURE SURVEY In this paper [1] the author presents a brief review about the toll collection system present in India, their advantages and disadvantages and proposes an efficient model for toll collection using Computer Vision Vehicle Detection for Toll

ISSN:0975-887

Collection System Using Embedded Linux platform. In his proposed system, a camera will capture an image of the arrived vehicle at toll plaza and depending on the size of the vehicle detected by camera, appropriate amount of toll is charged. And also, this system can be used to count the moving vehicles from stored videos. In this [2] paper, an algorithm is proposed to recognize Indian vehicle number plates. A camera is used to capture the image of the vehicle passing through toll plaza which then will be used to retrieve the vehicle number and using the vehicle number the toll amount from the respective account can be deducted. This algorithm addresses the problem of scaling and recognition of position of characters with a good accuracy. In this paper [3], the concept of Automated Toll Collection using a low cost and low power consuming microcontroller MSP430 Launch pad is discussed where they have used an approach where a traveller will pay the toll while in motion using RFID which will in turn save time, effort and man power. Also, the number of vehicles passing through toll plaza and number of times the vehicle passed through that toll plaza in a day is stored in database. The owner will receive an SMS message on his/her mobile about the details of the payment. This paper [4] compares the spectral range of the current RFID system with the future scenario where the modification of the spectral range for the TAV project is done and examined whether there was degradation of performance in the reading rate of RFID systems that were already implemented. In this paper the author [5] discusses about various threats posed while using RFID tags like privacy leakage when tags are read by an unauthorized reader. The author also proposes salted hash algorithm to avoid this theft where authentication of both tag and the reader is done without leaking any important and vulnerable values to the reader where the algorithm

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 2

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

responds to the reader with a random number each time it proposes a query. In this paper [6], RFID technology is used for development of tracking system for vehicles. Also, the paper addresses the major problems like traffic signal timings, congestion due to vehicles and theft of vehicles which can be detected using track logs of the vehicles. In this paper [7], the main study is done to explore the various existing approach of toll collection in India and also to their merits and demerits are discussed. Also, they have addressed the prevention of motorists and toll authorities‘ manually performance of ticket payments and to check driving without proper document, overloaded vehicle and others respectively. In [8] this paper RFID tag was tested against harsh environmental conditions like -30oC blast freezing and exposure to gamma irradiation. Also, survivability of the tag was checked by following criteria: read/write ability at different distances and within time threshold and data integrity of pre-encoded data before and after each test. The [9] author compared three different toll collection systems i.e. manual, semiautomated using pre-paid card and automated toll collection system using RFID technology. The survey conducted for ETC had following results: a) About 65% of the ETC user stated that higher transaction speed was the main reason of using ETC b)87% of respondents stated that it was easy to add the balance in the card c)66% of respondent had no problem in transaction d) Finally about 83% respondent were satisfied with the existing condition of the ETC. In this paper [10], author has proposed a system for automatic vehicle tracking, time management and also for automation of Toll gate. In this system, a computerized system automatically identifies an approaching vehicle and records the vehicle number & Time, it automatically opens the Toll Gate and a

ISSN:0975-887

predetermined amount is automatically deducted from its account. In [11], Vehicle Number Recognition (VNR), an image processing technology that uses efficient algorithms to detect the vehicle number from real-time images, is implemented for automatic toll tax collection. In [12], a system that enables road users to pay toll fees without stopping or slowing down was proposed and developed: a Global Positioning System (GPS)-based highway toll collection system. In general, the system utilizes GPS coordinates to detect whether a vehicle has passed through predefined locations; if so, the respective toll amount is deducted and the travel details are recorded. In [13], a fully passive, printable Quick Response (QR) code embedded chipless RFID (Radio Frequency Identification) technique is presented for secure identification of living and non-living amenities. The paper proposes a better technology than the barcode for identification purposes. Here, a series of QR codes is printed in the form of a resonator in a passive RFID tag, and the coded information is retrieved through a frequency-domain reflectometry method for identification. This tag can be read efficiently from a distance of about 2 km. In [14], the design of an algorithm for vehicle identification by recognizing the number plate is presented. The paper also presents the classification of a vehicle, based on the captured image, into small, medium, and large so that the toll amount can be deducted accordingly. Here, a Genetic Algorithm (GA) is employed at two levels: for detecting the vehicle in the traffic image and for recognizing characters from the number plate. Detection is based on contour and shape information. In [15], the problem of making an RFID reader read better is addressed. For this problem, the authors propose a method for optimizing the position of passive UHF RFID tags. First, a relatively ideal test

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 3


environment was built; then, for each location of the label attached to the container, the distance between the container and the antenna along a fixed direction was varied. Finally, the authors concluded how to determine the preferred location of an RFID tag.

3. GAP ANALYSIS
In India, almost all toll collection at toll plazas is done manually. Due to the large population and heavy road transportation, this is time consuming and causes traffic congestion at toll plazas. Some toll plazas in India have started to implement electronic toll collection, but it is not yet deployed on a large scale. Although many systems have been proposed for automated toll collection, the issue of theft detection has not been addressed so far. So, to enhance the current systems, we propose automated toll collection with theft detection to overcome time consumption, long queues, and fuel wastage, and to identify stolen vehicles.

4. PROPOSED SYSTEM
In the proposed system we use RFID (Radio Frequency Identification) technology, which uses radio frequency to identify objects. RFID thus enables automatic toll collection, which conserves time and energy and provides an efficient system for automated transactions.

In the proposed system, RFID tags are used. They can be attached to the front portion of the vehicle, i.e. the windshield, or to its side. Passive tags are used because of their feasibility: they do not have their own battery. When a vehicle enters the toll gate, the active device, i.e. the reader, emits radio waves; as soon as these waves reach the tag, they induce a magnetic field from which the tag draws power, and the tag sends its data to the controller. The reader is connected to a microcontroller; an Arduino with the ATmega328 is used here. The reader scans the tag and sends the ID to the Arduino, which checks the database for that unique ID. There is a user interface on the desktop at the toll plaza; after the database lookup, the vehicle details are displayed on it. If the details match, the amount is deducted and a command is issued to the servo motor to lift the barricade. A central database is maintained which contains the information of the valid user and their vehicle. So, as soon as the vehicle enters the toll plaza, the RFID tag is scanned and information regarding the vehicle is displayed. The toll is deducted automatically, and a message is sent to the registered mobile number using GSM technology. If the RFID number is not matched, the barricades are not lifted and the vehicle is blocked there; this is the theft detection. A servo motor is used for the movement of the barricades.

Fig 1: Block Diagram


4.1. Proposed pseudocode

Algorithm Check_Vehicle(No: RFID_Number):
    node = find_number(No)
    if node.status == "Blocked":
        sendmsg_to_userofcar()
        sendsignal_to_barricades()
    else:
        if node.amount > 200:
            if node.timer < 100:
                send_warning_msg()
            else:
                send_redalert_msg()
                sendmsg_to_usertoaddmoney()
    end

Fig 2: Flow of actions at toll plaza

If the vehicle's prepaid account does not have an ample amount, the vehicle is asked to move to another lane, i.e. the lane where toll is collected manually. A central database is maintained. It consists of the unique IDs and the information of the vehicle having that RFID; it also contains the parameters used to detect theft. A GSM module sends a message to the registered mobile number when the toll is deducted, along with the location of the toll plaza. In the proposed system, the required hardware is as follows:

- Arduino ATMega328
- Passive RFID tag
- RFID Reader
- GSM Module
- Stepper Motor

Using these hardware components, automated toll collection and theft detection can be achieved.
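As an illustration of the central database described above, each record could look like the following. All field names are assumptions; the paper only states that the database holds the unique RFID ID, the vehicle and owner details, the balance, and the parameters used to detect theft.

```python
# Illustrative record layout; every field name is assumed for this sketch.
vehicle_record = {
    "rfid_id": "TAG001",
    "vehicle_number": "MH12AB1234",        # hypothetical plate number
    "owner_name": "A. Kumar",              # hypothetical owner
    "registered_mobile": "+91XXXXXXXXXX",  # target of the GSM notification
    "balance": 500,
    "status": "OK",                        # "Blocked" marks a stolen vehicle
}


def is_theft(record):
    """A tag matches the theft criteria when its record is blocked."""
    return record["status"] == "Blocked"


print(is_theft(vehicle_record))  # False
```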


5. CONCLUSION AND FUTURE WORK
In this paper, the concept of automated toll collection using the Arduino ATmega328 is presented. We have used an approach in which a traveller is able to pay the toll while in motion using RFID communication technology. This approach saves travelling time, avoids traffic congestion, requires less manpower, and removes the hassle of handling cash. An important feature of the project is theft detection: when a stolen vehicle passes through the toll gate, it is detected and proper action can be taken. Thus, theft detection would have an impact at large scale. In future, a separate application could be provided to the police for tracking stolen or suspicious vehicles. At the same time, a multilane, barricade-less toll gate system could be created.

REFERENCES
[1] Abhijeet Suryatali, V. B. Dharmadhikari, "Computer Vision Based Vehicle Detection for Toll Collection System Using Embedded Linux", 2015 International Conference on Circuit, Power and Computing Technologies (ICCPCT).
[2] Hanit Karwal, Akshay Girdhar, "Vehicle Number Plate Detection System for Indian Vehicles", 2015 IEEE International Conference on Computational Intelligence & Communication Technology.
[3] Sana Said Al-Ghawi, Muna Abdullah Al Rahbi, S. Asif Hussain, S. Zahid Hussain, "Automatic Toll E-Ticketing System for Transportation Systems", 2016 3rd MEC International Conference on Big Data and Smart City.
[4] Renata Rampim de Freitas Dias, Hugo E. Hernandez-Figueroa, Luiz Renata Costa, "Analysis of Impacts on the Change of Frequency Band for RFID System in Brazil", 2013 IEEE International Conference on RFID Technologies and Applications, 4-5 September 2013, Johor Bahru, Malaysia.
[5] Pinaki Ghosh, Mahesh T R, "A Privacy Preserving Mutual Authentication Protocol for RFID based Automated Toll Collection System", November 2016.
[6] A. A. Pandit, Jyot Talreja, Ankit Kumar Mundra, "RFID Tracking System for Vehicles (RTSV)", 2009 First International Conference on Computational Intelligence, Communication Systems and Networks.
[7] K. Gowrisubadra, Jeevitha S., Selvarasi N., "A Survey on RFID Based Automatic Toll Gate Management", 2017 4th International Conference on Signal Processing, Communications and Networking (ICSCN 2017), March 16-18, 2017, Chennai, India.
[8] Alfonso Gutierrez, F. Daniel Nicolalde, Atul Ingle, Clive Hohberger, Rodeina Davis, William Hochschild, Raj Veeramani, "High-Frequency RFID Tag Survivability in Harsh Environments: Use of RFID in Transfusion Medicine", 2013 IEEE International Conference on RFID.


[9] Rudy Hermawan Karsaman, Yudo Adi Nugraha, Sri Hendarto, Febri Zukhruf, "A Comparative Study on Three Electronics Toll Collection Systems in Surabaya", 2015 International Conference on Information Technology Systems and Innovation (ICITSI), Bandung-Bali, November 16-19, 2015.
[10] Janani Krishnamurthy, Nitin Mohan, Rajeshwari Hegde, "Automation of Toll Gate and Vehicle Tracking", International Conference on Computer Science and Information Technology, 2008.
[11] Shoaib Rehman Soomro, Mohammad Arslan Javed, Fahad Ahmed Memon, "Vehicle Number Recognition System for Automatic Toll Tax Collection", 7 December 2012.
[12] Jin Yeong Tan, Pin Jern Ker, Dineis Mani, Puvanesan Arumugam, "Development of a GPS-based Highway Toll Collection System", 2016 6th IEEE International Conference on Control System, Computing and Engineering, 25-27 November 2016, Penang, Malaysia.
[13] G. Srivatsa Vardhan, Naveen Sivadasan, Ashudeb Dutta, "QR-Code based Chipless RFID System for Unique Identification", 2016 IEEE International Conference on RFID Technology and Applications (RFID-TA).
[14] P. Vijayalakshmi, M. Sumathi, "Design of Algorithm for Vehicle Identification by Number Plate Recognition", IEEE Fourth International Conference on Advanced Computing (ICoAC 2012), MIT, Anna University, Chennai, December 13-15, 2012.
[15] Zhu Zhi-yuan, Ren He, Tan Jie, "A Method for Optimizing the Position of Passive UHF RFID Tags", 2010 IEEE International Conference on RFID-Technology and Applications, 17-19 June 2010, Guangzhou, China.


WI-FI BASED HOME SURVEILLANCE BOT USING PI CAMERA & ACCESSING LIVE STREAMING USING YOUTUBE TO IMPROVE HOME SECURITY Ritik Jain1, Varshun Tiku2, Rinisha Bhaykar3, Rishi Ahuja4, Prof. S.P.Pingat5 1,2,3,4,5

Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India.

ABSTRACT
There are various surveillance systems available, such as cameras and CCTV. In these types of surveillance systems, only a person who is stationary and located in that particular area can view what is happening in that place. We propose a system to build a real-time live streaming and monitoring system using a Raspberry Pi with Wi-Fi connectivity, with which movements can be monitored in 360 degrees with the help of motors. We are also going to detect gas leakage. By using video cameras and analyzing in real time the images returned by the robot, the computation effort, cost, and resource requirements are significantly decreased. The Raspberry Pi is a simple circuit, and the operating system used is Raspbian OS. Gas leakage is one of the most frequently observed parameters and is extremely harmful, so the proposed system is capable of monitoring this value continuously without delay. Our proposed system is implemented on a Raspberry Pi interfaced with a gas sensor for controlling the device, and live video streaming is implemented for quick actions. Mobile video surveillance has been envisioned in the literature as classical video streaming extended over wired and wireless networks to assist the human operator.

1. INTRODUCTION
Traditionally, [1] surveillance systems are installed in every security-critical area. These systems generally consist of high-quality cameras, multiple computers for monitoring, servers for storing the videos, and many security personnel watching them. Considered as a whole, these systems can be very complex to install and maintain. The CCTV camera feeds are visible only in certain locations and have a limited range within which they can be viewed. Above all, the cost of implementing these systems is so high that they cannot be installed in every household. Remote monitoring is becoming an important network-based maintenance method. There are two units, the Raspberry Pi unit and the process unit, with a wireless link between them. The sensor unit sends sensor readings to the Raspberry Pi unit, which uploads them to the server. The Pi camera is connected to the Raspberry Pi CSI camera port. The Raspberry Pi is a credit-card sized computer and functions almost like a full computer. In existing surveillance systems such as cameras and CCTV, the person must be stationary in that particular area to view what is happening there, whereas here the scene can be viewed even while a person moves from place to place. The main advantages of this system are that it can be used for security purposes and that it offers privacy on both sides, since the stream is viewed only by an authorized person.

2. MOTIVATION
A robot is generally an electro-mechanical machine that can perform tasks automatically. Security is one of the


applications that everyone needs and wants to control remotely. Nowadays, houses are robbed by burglars and gas leakages cause fire hazards. By 2020, most homes will have home surveillance systems.

3. STATE OF ART
Smart Security Camera using Raspberry Pi and OpenCV is a system constructed for surveillance, designed to be used inside a warehouse facility. It is devised as a low-cost security camera with night-vision capability using a Raspberry Pi, and it can detect gas leakage, which can be used to avoid potential crimes and fires. [6] Basically, two gear motors are sufficient to produce the movement of the spy robot, and the motor driver module supplies enough current to drive the two gear motors, which protects the Raspberry Pi module from damage. The major advantage of using the minimum number of gear motors is minimized power consumption. The researchers developed a light-footed surveillance camera that can identify the condition of the scene being monitored and gives a notification or alarm as an event occurs. This system also provides security during

night time, as it has night-vision capability. Night vision is attained by simply removing the infra-red (IR) filter from an ordinary webcam, which can then be used for night-vision sensing with the help of an IR light-emitting-diode illuminator. A multi-environment robot for surveillance and live streaming was developed to make a real-time surveillance system possible within a local network. The live streaming is accomplished using the MJPG streamer, and the server-client model is built using Java. IP-based installations provide access from anywhere and hence are preferred over analogue systems. IP-based systems offer superior picture quality and are also favourable in terms of scalability and flexibility, but they need some networking knowledge and are more expensive than analogue ones. This Raspberry Pi controlled robot incorporates a server-client model built in Java, so it can work on any system such as Windows, Mac, or Linux. The entire model is connected to a local network, and anyone on that local network can control it from anywhere. The live streaming is done by the MJPG streamer.
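The server-client model described above can be sketched as a minimal socket exchange: the robot-side server accepts a one-letter movement command and acknowledges it. The command letters and reply format are assumptions for illustration, not the cited papers' protocol.

```python
import socket
import threading

VALID = {"F", "B", "L", "R", "S"}  # forward, backward, left, right, stop


def _serve_once(server):
    # Robot side: accept one client, read one command byte, acknowledge.
    conn, _ = server.accept()
    with conn:
        cmd = conn.recv(1).decode()
        conn.sendall(b"OK" if cmd in VALID else b"ERR")


def roundtrip(cmd):
    """Send one movement command to a loopback 'robot' server and
    return its acknowledgement."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # pick any free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=_serve_once, args=(server,))
    worker.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(cmd.encode())
        reply = client.recv(3).decode()
    worker.join()
    server.close()
    return reply


print(roundtrip("F"))  # OK
print(roundtrip("X"))  # ERR
```

On the real robot the server would translate each accepted command into GPIO writes instead of merely acknowledging it.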

4. GAP ANALYSIS

Table 1: Gap Analysis

1. Implementation of Spy Robot for A Surveillance System using Internet Protocol of Raspberry Pi (2017, IEEE): In this work, a Raspbian operating system based spy robot platform with remote monitoring and a control algorithm through IoT has been developed, which will save human lives, reduce manual error, and protect the country from enemies.

2. Implementation of Cloud Based Live Streaming for Surveillance (2016, ICCSP): This paper presents a cloud-based surveillance system for live video streaming that can be monitored from anywhere and at any time.

3. Video Surveillance Robot Control using Smartphone and Raspberry Pi (2016, ICCSP): This paper proposes a method for controlling a wireless surveillance robot using an application built on the Android platform.

4. Remote Control Robot Using Android Mobile Device (2014, ICCC): The paper describes the design and realization of a mobile application for the Android operating system focused on manual control of a mobile robot using wireless Bluetooth technology.

5. A Model for Remote Controlled Mobile Robotic over Wi-Fi Network Using Arduino Technology (2014, ICFCNA): A camera, the "eye of the robot", captures and transmits images/videos to the operator, who can then recognize the surrounding environment and remotely control the module.

5. PROPOSED SYSTEM
We propose a system to build a real-time live streaming and monitoring system using a Raspberry Pi with Wi-Fi connectivity. In the monitoring phase, the Pi records the video of the location in real time. Video capture is started through commands given from the computer to the Raspberry Pi; these commands are communicated over Wi-Fi. The Pi camera is used, which gives very good picture quality in the video. The Raspberry Pi is connected to the motor driver using the General Purpose Input Output (GPIO) pins of the Raspberry Pi. The GPIO pins are connected to the input pins of the motor shield, and the output pins of the motor shield are connected to the motors. [4] The motor driver IC allows a DC motor to run in either the clockwise or anticlockwise direction. The L293D works on the H-bridge principle.

There are two H-bridges in the IC and four input pins; each pair of pins controls a single DC motor. By changing the logic levels on the two pins, such as "0 and 1" or "1 and 0", the motor's rotation direction is controlled. A portable charger supplying 2 A is connected to the motor shield and the Raspberry Pi. Once the connections are made properly, the Raspberry Pi is ready to boot up. A Python program is written for controlling the motors, wherein the GPIO pins give out the output from the Raspberry Pi to the motor shield. The robot's movement is controlled through the directions shown on a web page created using Hypertext Markup Language (HTML) and accessed via its Universal Resource Locator (URL) address. These commands are communicated through Wi-Fi to the Raspberry Pi model B. The camera module is installed into its port and enabled in the Raspberry Pi settings. For the


live streaming of videos, the MJPEG streamer is installed and configured. After the configuration steps are done, the live streaming can be viewed in the app as well as on the website. The website has been developed to allow a large number of people to experience the live streaming irrespective of their location. Admin rights are used here to restrict the visibility of critical information to authentic users only.

5.1 ARCHITECTURE

Fig 1: Architecture

5.2 MATHEMATICAL MODEL
The mathematical model for this system is as follows:
Input = {in1, in2, in3, in4}
Forward = {in1=1, in2=0, in3=1, in4=0}
Backward = {in1=0, in2=1, in3=0, in4=1}
Right = {in1=1, in2=0, in3=0, in4=0}
Left = {in1=0, in2=0, in3=1, in4=0}
Stop = {in1=0, in2=0, in3=0, in4=0}
where in1 and in2 denote the inputs of the left motor, and in3 and in4 denote the inputs of the right motor.

5.3 ALGORITHM
1. Result = get data from Firebase database
2. If Result is equal to 'F', move robot FORWARD
3. If Result is equal to 'B', move robot BACKWARD
4. If Result is equal to 'R', move robot RIGHT
5. If Result is equal to 'L', move robot LEFT
6. If Result is equal to 'S', robot STOP
7. If gas leakage is detected by the gas sensor, send an alert message to the registered mobile number.
(The live streaming is started directly from a terminal command.)

5.4 FLOWCHART

Fig 2: Flow Chart
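The mathematical model maps directly to a small command dispatcher. The pin-state tuples are taken from the model; the `write_pin` callback is a hypothetical hook (e.g. wrapping `RPi.GPIO.output`) so the logic stays testable off-device.

```python
# (in1, in2) drive the left motor and (in3, in4) the right motor,
# exactly as in the mathematical model above.
COMMANDS = {
    "F": (1, 0, 1, 0),  # forward
    "B": (0, 1, 0, 1),  # backward
    "R": (1, 0, 0, 0),  # right: only the left motor runs
    "L": (0, 0, 1, 0),  # left: only the right motor runs
    "S": (0, 0, 0, 0),  # stop
}


def drive(command, write_pin=None):
    """Resolve a command letter (as fetched from the Firebase database)
    into the four motor-input states and optionally write them out."""
    states = COMMANDS[command]
    if write_pin is not None:
        for pin_index, state in enumerate(states):
            write_pin(pin_index, state)
    return states


print(drive("F"))  # (1, 0, 1, 0)
```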

6. CONCLUSION
We have built a smart supervisor system for surveillance and real-time video streaming in which authentication is required to access the system. The smart supervisor system displays the gas sensor value; this message is based on the response received from the smart supervisor system server and the smartphone. Whenever a gas leakage is detected, an alert is sent to the registered mobile number. If the correct IP address is provided, the app proceeds to display the various device operations and video streaming operations. According to the instructions provided by the app on our Android mobile, we can operate the movement of the robot in the forward, backward, left, and right directions. The command used for live streaming is as follows:

raspivid -o - -t 0 -vf -hf -fps 10 -b 500000 | ffmpeg -re -ar 44100 -ac 2 -acodec pcm_s16le -f s16le -ac 2 -i /dev/zero -f h264 -i - -vcodec copy -acodec aac -ab 128k -g 50 -strict experimental -f flv rtmp://a.rtmp.youtube.com/live2/j1s8d349-9536-8d6r

[2] Surveillance systems are available with various features, and selection is based on factors such as cost and video quality. The proposed system is cost effective as well as user friendly. It has applications in different fields such as military, defence, house, office, and environment monitoring. The system can be enhanced with face detection and recognition to follow a particular person, such as children below 4 years, so that they are continuously in front of our eyes.

7. FUTURE SCOPE
1. Major improvements in the system's processor speed are needed in order to process large files, e.g. video, for effective motion detection and tracking.
2. The designed security system can be used in homes to monitor the facility at any given time.
3. The system needs to be remotely controlled; hence, future explorations should focus on this.

REFERENCES


[1] R, H., & Safwat Hussain, M. H. (2018). Surveillance Robot Using Raspberry Pi and IoT. 2018 International Conference on Design Innovations for 3Cs Compute Communicate Control (ICDI3C). doi:10.1109/icdi3c.2018.00018
[2] Oza, N., & Gohil, N. B. (2016). Implementation of cloud based live streaming for surveillance. 2016 International Conference on Communication and Signal Processing (ICCSP). doi:10.1109/iccsp.2016.7754297
[3] Nadvornik, J., & Smutny, P. (2014). Remote control robot using Android mobile device. Proceedings of the 2014 15th International Carpathian Control Conference (ICCC). doi:10.1109/carpathiancc.2014.6843630
[4] Bokade, A. U., & Ratnaparkhe, V. R. (2016). Video surveillance robot control using smartphone and Raspberry Pi. 2016 International Conference on Communication and Signal Processing (ICCSP). doi:10.1109/iccsp.2016.7754547
[5] Aneiba, A., & Hormos, K. (2014). A Model for Remote Controlled Mobile Robotic over Wi-Fi Network Using Arduino Technology. International Conference on Frontiers of Communications, Networks and Applications (ICFCNA 2014, Malaysia). doi:10.1049/cp.2014.1429
[6] Abdalla, G. O. E., & Veeramanikandasamy, T. (2017). Implementation of spy robot for a surveillance system using Internet protocol of Raspberry Pi. 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT). doi:10.1109/rteict.2017.8256563


SMART DUSTBIN WITH METAL DETECTOR

Dhiraj Jain1, Vaidehi Kale2, Raksha Sisodiya3, Sujata Mahajan4, Dr. Mrs. Gitanjali R. Shinde5 1,2,3,4,5

Computer Department, SKNCOE, Pune,India. [email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
In the past few decades there has been a rapid increase in urbanization, so waste management is one of the issues we face nowadays. As India is a developing nation, an important challenge is turning our nation's cities into smart cities. The Swachh Bharat Mission is an urban renewal and retrofitting program by the Government of India with the mission to develop 100 cities across the country, making them citizen friendly and sustainable. The aim of the mission is to cover all rural and urban areas of the country to present India as an ideal country before the world. To make this possible we need smart cities with smart streets enabled with a smart garbage monitoring system. In the proposed system, multiple dustbins from different areas throughout the city are connected using IoT technology. Each dustbin uses low-cost embedded devices to sense its fill level, which is sent to the municipality officer. The smart bin is built on an Arduino Uno board interfaced with a GSM modem, an ultrasonic sensor, and a metal detector. The ultrasonic sensor is placed at the top of the dustbin to measure its status, and the metal detector prevents metal from getting mixed with the garbage. The Arduino is programmed such that while the dustbin is being filled, the remaining height from the threshold height is displayed. Once the garbage reaches the threshold level, the ultrasonic sensor triggers the GSM modem, which continuously alerts the required authority. The metal detector also gives an alert to indicate that the garbage contains metal.

Keywords: GSM (Global System for Mobile communication); IoT (Internet of Things); LED (Light Emitting Diode); ILP (Integer Linear Programming); Smart city; Smart Garbage Dustbins; Arduino; Ultrasonic Sensors

1. INTRODUCTION
The main aim of this project is to reduce human resources and efforts along with the enhancement of a smart city vision. With respect to urbanization, we must have sustainable plans for future urban development. To achieve this, we propose smart dustbins with a metal detector. Our proposed project is based on IoT, which refers to a wireless network between objects. The Internet of Things helps us make dustbins that can be easily sensed and remotely accessed and controlled from the internet; here we get real-time information about the dustbins. This is done using microcontrollers and transceivers for digital communication that are able to communicate with one another. [1] There is a rapid growth in urbanization and modernization.

Garbage Monitoring System and Metal Detection: Garbage may consist of unwanted surplus material from the city, public areas, societies, colleges, homes, etc. These wastes emit poisonous gases which are harmful to nearby residents and lead to severe diseases. This survey is related to the "smart garbage monitoring system using the Internet of Things". So, for a smart lifestyle, cleanliness is crucial. This helps us eradicate the garbage disposal problem using the Internet of Things (IoT). [3] The main problem with the current waste management system in most Indian cities is the unhealthy status of dustbins. In this project, we have tried to upgrade the trivial but vital component of the urban


waste management system, i.e. the dustbin. The main focus of our project is to create an automatic waste management system across the whole city, monitored efficiently by a single system, and to separate metal from the garbage at its origin, reducing the separation of metals and garbage at the dumping place and thereby its cost. This can prove to be a new revolution in smart city implementation.

2. MOTIVATION
Malodorous rotten wastes that remain untreated for a long time, due to the negligence of authorities and the carelessness of the public, may lead to long-term problems. Breeding of insects and mosquitoes can cause dreadful diseases. The garbage also contains various metals that can be recycled, such as tin cans and metal containers, which are currently separated from the garbage at the dumping place at high cost.

3. LITERATURE SURVEY
[1] Dharna Kaushik and Sumit Yadav, in ―Multipurpose Street-Smart Garbage Bin Based on IoT‖, proposed a system with multiple smart garbage trash bins on a microcontroller board platform (Arduino board) located throughout a city, campus, or hospital. The Arduino board is interfaced with a GSM modem and an ultrasonic sensor. Once the threshold level is crossed, the ultrasonic sensor triggers the GSM module, which in turn continuously alerts the authorized person by sending SMS reminders until the dustbin is cleaned. Besides this, a central system keeps showing the current status of garbage in a mobile web browser via an HTML page over Wi-Fi. With its help, the shortest path for garbage collection vehicles is created using Dijkstra's

Algorithm. This is real-time waste management using smart trash bins that can be accessed anytime, anywhere by the concerned person. [2] Bikramjit Singh, Manpreet Kaur et al., in ―Smart Dustbins for Smart Cities‖, propose that the garbage collection system has to be smarter; in addition, people need easy access to garbage disposal points, and the garbage collection process has to be efficient in terms of time and fuel cost. The paper covers a GPS- and internet-enabled smart dustbin, garbage collection and disposal, garbage collection scheduling, and finding the nearest dustbin. [3] Ahmed Omara, Damla Gulen, Burak Kantarci, and Sema F. Oktug, in ―Trajectory-Assisted Municipal Agent Mobility: A Sensor-Driven Smart Waste Management System‖, propose a WSN-driven system for smart waste management in urban areas. In the proposed framework, the waste bins are equipped with sensors that continuously monitor the waste level and trigger alarms that are wirelessly communicated to a cloud platform to actuate the municipal agents, i.e., waste collection trucks. They formulate an Integer Linear Programming (ILP) model to find the best set of trajectory-truck pairs with the objective of minimum cost or minimum delay. For the trajectory assistance to work in real time, they propose three heuristics, one of which is greedy. Through simulations, they show that the ILP formulation can provide a baseline reference for the heuristics, and that the non-greedy heuristics can significantly outperform the greedy approach in cost and delay under moderate waste accumulation scenarios. [4] Minthu Ram Chiary, Sripathi SaiCharan, Abdul Rashath R., and Dhikhi T., in ―Dustbin Management System Using IoT‖, propose a system in which the smart dustbins are connected to the internet to get their real-time status. In

Department of Computer Engineering, SKNCOE, Vadgaon(Bk), Pune. ISSN: 0975-887

Page 13

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

the recent years, there has been a rapid growth in population, which leads to more waste disposal. So, a proper waste management system is necessary to avoid the spread of diseases, by managing the smart bins through monitoring their status and taking decisions accordingly. There are multiple dustbins located in the city or on a campus (educational institutions, companies, hospitals, etc.). These dustbins are interfaced with a microcontroller-based system with ultrasonic sensors and Wi-Fi modules. The ultrasonic sensor detects the level of waste in the dustbin and sends a signal to the microcontroller; the same signal is encoded and sent through a Wi-Fi module (ESP8266) and received by the end user. The data is sent to the user through e-mail, i.e., a mail is sent as a notification that the dustbin is full, so that the municipality van can come and empty it.

[5] N. Sathish Kumar, B. Vuayalakshmi et al., in ―IOT based smart garbage alert system using Arduino UNO‖, proposed a smart alert system for garbage clearance that gives an alert signal to the municipal web server for instant cleaning of the dustbin, with proper verification based on the level of garbage filling. This process is aided by an ultrasonic sensor interfaced with an Arduino UNO to check the level of garbage in the dustbin; it sends an alert to the municipal web server once the garbage is filled. After cleaning the dustbin, the driver confirms the task of emptying the garbage with the aid of an RFID tag. RFID is a computing technology used for the verification process; in addition, it enhances the smart garbage alert system by providing automatic identification of the garbage filled in the dustbin and sends the clean-up status to the server, affirming that the work is done. The whole process is upheld by an embedded module integrated with RFID and IoT facilities.
An Android application is developed and linked to a web server to communicate the alerts from the microcontroller to the urban office and to perform remote monitoring of the cleaning process done by the workers, thereby reducing the manual process of monitoring and verification. The notifications are sent to the Android application using a Wi-Fi module.

4. GAP ANALYSIS
Table: Gap Analysis

System: Multipurpose Street-Smart Garbage Bin Based on IoT
Benefit: Continuously alerts the authorized person by sending SMS reminders.
Limitation: Status is accessed on a web browser as an HTML page; there is no application.

System: Smart Dustbins for Smart Cities
Benefit: Provides the location of the nearest dustbin for disposing of garbage.
Limitation: Garbage collection scheduling is done only when many of the dustbins are full.

System: Dustbin Management System Using IoT
Benefit: Microcontroller-based system with ultrasonic sensors and Wi-Fi modules.
Limitation: The status of the dustbin is sent to the user through e-mail.

System: Trajectory-Assisted Municipal Agent Mobility: A Sensor-Driven Smart Waste Management System
Benefit: Formulates an Integer Linear Programming (ILP) model to find the best trajectory-truck assignment with the objectives of minimum cost or minimum delay.
Limitation: Has no metal detector to detect metal.

5. PROPOSED WORK
A. System Architecture


Fig:System Architecture

System architecture includes the modules used in the project and the relationships between them, based on data flow and processing. The system consists of the following components:

• Dustbin
• LED
• Metal Detector
• Ultrasonic Sensor
• Arduino Board
• GSM Module
• User Interface

The Arduino Uno board is interfaced with a GSM modem, an ultrasonic sensor and a metal detector. When waste is dumped into the dustbin, the metal detector detects whether the waste contains metal or not; if metal is present, it gives an alert. The ultrasonic sensor is placed at the top of the dustbin and measures the fill height of the dustbin. The threshold height is set as 10 cm. The Arduino is programmed in such a way that, as the dustbin is being filled, the remaining height up to the threshold is displayed. Once the garbage reaches the threshold level, the ultrasonic sensor triggers the GSM modem, which continuously alerts the concerned authority by sending the dustbin's data.

B. Arduino and GSM Module Interface

Fig: Module Interface

Global System for Mobile communication (GSM) is a digital cellular system used for mobile devices. It is an international standard for mobile communication, widely used for long-distance communication. Various GSM modules are available in the market, such as the SIM900, SIM700, SIM800, SIM808 and SIM5320. The SIM900A module allows users to send/receive data over GPRS, send/receive SMS and make/receive voice calls. Connecting a GSM modem to an Arduino is simple: connect the RX line of the Arduino to the TX line of the GSM modem, and vice versa (TX of the Arduino to RX of the GSM modem). Make sure to use the TTL RX/TX lines of the GSM modem. Give the GSM modem a 12 V, 2 A power supply; a supply with less current can cause the GSM modem to reset, so give it sufficient current.

C. Metal Detector Using Arduino Model

Fig: Metal Detector Using Arduino Model

An LED and a buzzer are used as the metal-detection indicator. A coil and a capacitor are used for detecting metals, a signal diode is used to reduce the voltage, and a resistor limits the current into the Arduino pin. The working of this Arduino metal detector is a bit tricky. A block wave (pulse) generated by the Arduino is fed to the LR high-pass filter. Due to this, short spikes are generated by the coil on every transition. The pulse length of the generated spikes is proportional to the inductance of the coil, so with the help of these spike pulses we can measure the inductance of the coil. A capacitor is charged by the rising pulses (spikes), and it requires a few pulses to charge the capacitor to the point


where its voltage can be read by Arduino analog pin A5.

D. Mathematical Model
The server collects the fill-up status and location of the dustbins. It processes the client's query and responds with the nearest dustbin location and directions to access it. Let:
C - current fill-up height
L - total depth of the dustbin
T - time duration between generation of the wave and reception of the echo by the receiver
S - the speed of sound.
The value of C is calculated using the formula
C = L - (S*T)/2
and the percentage of fill-up is calculated using the formula
P = (C/L) * 100
where P is the % fill-up. Here we assume the wave path is almost vertical.

6. CONCLUSION AND FUTURE WORK
This project was developed with the intention of contributing to smart cities; however, there is a lot of scope to improve the performance of the proposed system in the areas of user interface, new features and query processing time. The future enhancements that are possible in the project are as follows: if the system is sponsored, additional sensors for wet and dry waste segregation can be added.

REFERENCES
[1] Dharna Kaushik, Computer Science and Engineering, Indira Gandhi Delhi Technical University for Women, Delhi, India, and Sumit Yadav, Computer Science and Engineering, Indira Gandhi Technical University for Women, Delhi, India, ―Multipurpose Street-Smart Garbage bin based on IoT‖, Volume 8, No. 3, March-April 2017.
[2] Bikramjit Singh, Manpreet Kaur, ―Smart Dustbins for Smart Cities‖, (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 7 (2), 2016, 610-611.
[3] Ahmed Omara, Damla Gulen, Burak Kantarci and Sema F. Oktug, ―Trajectory-Assisted Municipal Agent Mobility: A Sensor-Driven Smart Waste Management System‖, published 21 July 2018.
[4] Minthu Ram Chiary, Sripathi SaiCharan, Abdul Rashath R., Dhikhi T., Computer Science and Engineering, Saveetha School of Engineering, Saveetha University, ―DUSTBIN MANAGEMENT SYSTEM USING IOT‖, Volume 115, No. 8, 2017, 463-468, ISSN: 1311-8080.
[5] N. Sathish Kumar, B. Vuayalakshmi, R. Jenifer Prarthana, A. Shankar, Sri Ramakrishna Engineering College, Coimbatore, Tamil Nadu, India, ―IOT based smart garbage alert system using Arduino UNO‖, IEEE 978-1-5090-2597-8.
[6] Narayan Sharma, Nirman Singha, Tanmoy Dutta, ―Smart Bin Implementation for Smart Cities‖, International Journal of Scientific & Engineering Research, Volume 6, Issue 9, September 2015, ISSN 2229-5518.
[7] ―Smart Cities‖, available at www.smartcities.gov.in/
[8] ―GSM Module Interface‖, https://circuits4you.com/2016/06/15/gsmmodem-interfacing-arduino/
[9] ―GSM‖, https://www.arduino.cc/en/Guide/ArduinoGSMShield
[10] ―GSM Module‖, http://www.circuitstoday.com/interface-gsmmodule-with-arduino
[11] ―Arduino‖, https://www.arduino.cc/
[12] ―Android‖, https://developer.android.com/studio/
[13] ―GSM Module‖, www.electronicwings.com/arduino/sim900agsm-module-interfacingwith-arduino-uno


IMPROVEMENT IN PERSONAL ASSISTANT
Ashik Raj1, Sreeja Singh2, Deepak Kumar3, Deshpande Shivani Shripad4

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Vadgaon (Bk), Pune, India. [email protected], [email protected], [email protected], [email protected]

ABSTRACT
In this paper, we describe how Artificial Intelligence technologies are beginning to be actively used in human life, facilitated by the appearance and wide dissemination of the Internet of Things (IoT). Autonomous devices are becoming smarter in the way they interact both with humans and with each other. New capacities lead to the creation of various systems for integrating smart things into Social Networks of the Internet of Things. One of the relevant trends in artificial intelligence is the technology of recognizing the natural language of a human. New insights in this topic can lead to new means of natural human-machine interaction, in which the machine learns how to understand human language.

Keywords
Virtual Personal Assistants; Multi-modal Dialogue Systems; Gesture Recognition; Image Recognition; Intrusion Detection

1. INTRODUCTION
Today, the development of artificial intelligence (AI) systems that are able to organize natural human-machine interaction (through voice, communication, gestures, facial expressions, etc.) is gaining in popularity. The machine learns to communicate with a human, exploring his actions, habits and behavior, and trying to become his personalized assistant. The work on creating and improving such personalized assistants has been going on for a long time. These systems are constantly improving and go beyond the personal computer. Spoken dialogue systems are intelligent agents that are able to help users finish tasks more efficiently via spoken interactions. Spoken dialogue systems are also being incorporated into various devices such as smart-phones, smart TVs and in-car navigation systems.

In this proposal, we propose an approach that will be used to design the Next Generation of Virtual Personal Assistants, increasing the interaction between users and computers by using a multi-modal dialogue system with techniques including gesture recognition, image/video recognition, speech recognition, a vast dialogue and conversational knowledge base, and a general knowledge base. Moreover, our approach can be used in different tasks including education assistance, medical assistance, robotics and vehicles, disability systems, home automation, and security access control.

2. GENERAL TERM
The dialogue system is an active area that many companies use to design and improve their new systems. According to CHM Research, before 2030, millions of us will be using ―voice‖ to interact with machines, and voice-driven services will become part and parcel of smartphones, smart glasses, home hubs, kitchen equipment, TVs, games consoles, thermostats, in-car systems and apparel. There are many techniques used to design dialogue systems, based on the application and its complexity. On the basis of the method used to control the dialogue, a dialogue system can be classified into three categories: finite state (or graph) based systems, frame-based systems and agent-based systems.
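The three categories differ in how rigidly they control the conversation. As an illustration (not from the paper), a finite-state (graph-based) dialogue system can be sketched as a table of allowed state transitions; the state and intent names below are invented for the example:

```python
# Minimal finite-state (graph-based) dialogue manager sketch.
# States and intents are illustrative, not a real assistant's.
TRANSITIONS = {
    ("start", "greet"): "ask_task",
    ("ask_task", "set_alarm"): "ask_time",
    ("ask_time", "give_time"): "confirm",
    ("confirm", "yes"): "done",
    ("confirm", "no"): "ask_task",
}

def run_dialogue(intents, state="start"):
    """Follow user intents through the graph; unknown moves are rejected."""
    for intent in intents:
        nxt = TRANSITIONS.get((state, intent))
        if nxt is None:
            return state, False   # system would re-prompt instead of advancing
        state = nxt
    return state, True

print(run_dialogue(["greet", "set_alarm", "give_time", "yes"]))  # ('done', True)
```

The rigidity is visible in the reject branch: any utterance not on an outgoing edge of the current state is refused, which is exactly the limitation frame-based and agent-based systems relax.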


Also, there are many different architectures for dialogue systems. Which sets of components are included in a dialogue system, and how those components divide up responsibilities, differs from system to system. One view is that a dialogue system has mainly six components: Input Decoder, Natural Language Understanding, Dialogue Manager, Domain Specific Component, Response Generator, and Output Renderer. Similarly, there are six main components in general dialogue systems: Speech Recognition (ASR), Spoken Language Understanding (SLU), the Dialog Manager (DM), Natural Language Generation (NLG), Text-to-Speech Synthesis (TTS), and the knowledge base. The following is the structure of the general dialogue system.

3. THE PROPOSED VPAS SYSTEM
In this proposal, we have used multi-modal dialogue systems, which process two or more combined user input modes, such as speech, image, video, touch, manual gestures, gaze, and head and body movement, in order to design the Next-Generation VPA model. We have modified and added some components to the original structure of general dialogue systems, such as the ASR Model, Gesture Model, Graph Model, Interaction Model, User Model, Input Model, Output Model, Inference Engine, Cloud Servers and Knowledge Base. The following is the structure of the Next-Generation of Virtual Personal Assistants. This model includes intelligence algorithms to organize the input information before sending the data to the Interaction Model.

Knowledge Base: There are two knowledge bases. The first is online and the second is a local knowledge base; they include all the data and facts for each model, such as facial and body data sets for the Gesture Model, speech recognition knowledge bases, a dictionary and spoken dialogue knowledge base for the ASR Model, video and image body data sets for the Graph Model, and some of the user's information and the system settings.

B. Graph Model
The Graph Model analyzes video and images in real time. It extracts frames from the video collected by the camera and the Input Model, then sends those frames and images to the Graph Model applications in the Cloud Servers, which analyze them and return the result.

1.2 Comparison of features of popular VPAs in the market

Fig: Gap Analysis

Fig 1: Block diagram of system architecture
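The flow through the proposed models (per-mode models feeding an Interaction Model, whose merged output the Inference Engine acts on) can be sketched roughly as below. The handler functions, keys and intents are stand-ins invented for the example, not the paper's algorithms:

```python
# Rough sketch of the multi-modal flow: each input mode is handled by its
# own model, the Interaction Model merges the results, and the Inference
# Engine picks a response. All names here are illustrative.

def asr_model(inputs):      return {"text": inputs.get("speech", "")}
def gesture_model(inputs):  return {"gesture": inputs.get("gesture", "none")}

def interaction_model(inputs):
    """Merge per-mode results into one interpretation of the user's turn."""
    merged = {}
    merged.update(asr_model(inputs))
    merged.update(gesture_model(inputs))
    return merged

def inference_engine(merged):
    """Decide what to do with the merged, multi-modal interpretation."""
    if merged["gesture"] == "wave" and not merged["text"]:
        return "greet_user"           # a gesture alone can carry the intent
    return "handle: " + merged["text"] if merged["text"] else "ask_again"

turn = {"speech": "turn on the light", "gesture": "none"}
print(inference_engine(interaction_model(turn)))  # handle: turn on the light
```

The point of the sketch is the merge step: because all modes land in one dictionary, the engine can answer a silent wave just as it answers a spoken command.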

Competition: Google Now:


Launched in 2012, Google Now is an intelligent personal assistant made by Google. It was first included in Android 4.1, which launched on July 9, 2012, and was first supported on the Google Nexus smart-phone. Found within the Google search option, Google Now can be used in numerous helpful ways. It can set reminders or answer basic questions like the day's weather or the names of the movies that won Oscars last year. But more than that, Google Now is a virtual assistant that shows you relevant and timely information once it learns more about you and how you use the phone. Google Now also displays different sections, called Now cards, that pull information from your Gmail account and show it on the screen. For example, if you have recently bought a red bag from Amazon, a card shows you your recent purchase. Similarly, it also has a weather card, where you can check the weather, and a sports card, where you can follow any match that is on.

Amazon Alexa:
Amazon Alexa, known simply as Alexa, is a virtual assistant developed by Amazon, first used in the Amazon Echo and the Amazon Echo Dot smart speakers developed by Amazon Lab126. It is capable of voice interaction, music playback, making to-do lists, setting alarms, streaming podcasts, playing audiobooks, and providing weather, traffic, sports, and other real-time information, such as news. Alexa can also control several smart devices, acting as a home automation system. Users can extend Alexa's capabilities by installing ―skills‖ (additional functionality developed by third-party vendors, more commonly called apps in other settings, such as weather programs and audio features).

Cortana:
Cortana is the name of the interactive personal assistant built into Windows 10. You can give her instructions and talk with her by using your voice or by typing. Cortana, named after her fictional counterpart in the video game series Halo, takes notes, dictates messages and offers up calendar alerts and reminders. But her real standout characteristic, and the one Microsoft is betting heavily on, is the ability to strike up casual conversations with users, what Microsoft calls ―chitchat‖.

4. CONCLUSION
In this paper we have seen the working of a personal virtual assistant using Natural Language Processing and the Internet of Things, and also the implementation of an intrusion detection system with the help of a passive infrared (PIR) sensor for detecting motion.

REFERENCES
[1] S. Arora, K. Batra, and S. Singh. Dialogue System: A Brief Review. Punjab Technical University.
[2] Ding, W. and Marchionini, G. 1997. A Study on Video Browsing Strategies. Technical Report. University of Maryland at College Park.
[3] R. Mead. 2017. Semio: Developing a Cloud-based Platform for Multimodal Conversational AI in Social Robotics. 2017 IEEE International Conference on Consumer Electronics (ICCE).
[4] R. Pieraccini, K. Dayanidhi, J. Bloom, J. Dahan, M. Phillips. 2003. A Multimodal Conversational Interface for a Concept Vehicle. Eurospeech 2003.
[5] G. Bohouta and V. Z. Këpuska. 2017. Comparing Speech Recognition Systems (Microsoft API, Google API and CMU Sphinx). Int. Journal of Engineering Research.
[6] M. McTear. 2016. The Dawn of the Conversational Interface. Springer International Publishing Switzerland 2016.
[7] Amazon. Amazon Lex is a service for building conversational interfaces. https://aws.amazon.com.
[8] B. Marr. The Amazing Ways Google Uses Deep Learning AI. https://www.forbes.com.
[9] K. Wagner. Facebook's Virtual Assistant 'M' Is Super Smart. It's Also Probably a Human. https://www.recode.com.


IoT BASED HOME AUTOMATION SYSTEM FOR SENIOR CITIZENS

Ashwathi Sreekumar1, Divyanshi Shah2, Himanshi Varshney3

Dept. of Computer Engineering, Smt. Kashibai Navale College of Engineering, Savitribai Phule Pune University, Pune, India. [email protected], [email protected], [email protected]

ABSTRACT
Smart homes promise to make the lives of the senior citizens of our society more comfortable and safer. However, the goal has often been to develop new services for young people rather than assisting old people to improve their quality of life. Important is the potential for using these technologies to promote safety and prevent injury among old people, because this group is at home more than other age groups. Networked devices can collect data from sensors and can instruct and remind individuals about safety-related issues. The work focuses on the concept of home automation, where monitoring and control operations are facilitated through smart devices installed at home.

Keywords
IoT, Smart Home, Security, Raspberry Pi, remote sensor, relay, Wi-Fi, Mobile phone, Home Automation for elderly, Emergency support.

1. INTRODUCTION
Nowadays, many daily activities are automated thanks to the rapid enhancement of electronic devices. Automation is a technique, method, or system of operating or controlling a process by electronic devices, reducing human involvement to a minimum. Interest in building automation systems for offices and homes is increasing day by day, with numerous benefits. While many industrial facilities use automation systems and are almost fully automated, home automation systems, on the other hand, are rarely used in the houses of common people, mainly because of the high cost of these kinds of systems. This form of home automation focuses on making it possible for older adults to remain at home, safe and comfortable, rather than move to a healthcare facility. This project acknowledges that home automation can make a difference regarding better energy management and the usage of renewable energy sources, but tailors it towards older adults. Home automation for healthcare can range from very simple alerts to lavish computer-controlled network interfaces.

Some of the monitoring or safety devices that can be installed in a home include lighting and motion sensors, environmental controls, video cameras, automated timers, emergency assistance systems, and alerts. In order to maintain the security of the home, many home automation systems integrate features such as remote keyless entry systems, which allow seniors to view who is at the door and then remotely open it. Home networks can also be programmed to automatically lock doors in order to maintain privacy. In simple installations, automation may be as straightforward as turning on the lights when a person enters the room. In advanced installations, rooms can sense not only the presence of a person inside but know who that person is, and perhaps set appropriate lighting, temperature, music levels or television channels, taking into account the day of the week, the time of day, and other factors. A further goal is to design a remotely controlled multifunction outlet handled using Google Assistant and other sources; the request is sent to the designated device via Wi-Fi.


2. MOTIVATION
What happens when our senior loved ones still want to live independently at home, but we worry about them? What if we had a smart home system that could provide information on an ageing loved one and give some peace of mind? With the advancement of technology, living life has become a lot easier from the material point of view. Seeing the elderly go about their daily routine, aloof from these advancements, we were motivated to find a solution to help them overcome these difficulties, so that they can live independently and securely in their current home for as long as possible, thus giving family members peace of mind. This is called ‗ageing in place‘.

3. LITERATURE REVIEW
IoT Based Smart Security and Home Automation System (2016): The paper is written by Ravi Kishore Kodali, Vishal Jain, Suvadeep Bose and Lakshmi Boppana. The system sends alerts to the owner over voice calls using the Internet if any sort of human movement is sensed near the entrance of the house. The microcontroller used is the TI CC3200 LaunchPad board, which comes with an embedded microcontroller and an onboard Wi-Fi shield. The advantages of the proposed system are that it is a low-cost system with minimum requirements, it is platform independent, and the phone need not necessarily be connected to the internet.

IoT: Secured and Automated House (2017): The paper is presented by Hakar Mohsin Saber and Nawzad Kamaran Al-Salihi. The system uses an Arduino with a Teleduino web server and an Android application. It also uses a cloud web server. The advantage of this system is that it sends an SMS alert to the user using a cloud server API to make it cost effective. The disadvantages were limited memory due to the usage of a SIM card, and that the application sends a 25-message clear signal before sending the alert.

Dual Tone Multi Frequency based Home Automation System (2017): The authors of the paper are Rucha R. Jogdand and B. N. Choudhari. Dual-tone multi-frequency (DTMF) tones are the audible sounds you hear when you press keys on your phone. The system pairs a DTMF decoder with a wireless module. When a button is pressed on the mobile, it generates a tone which is decoded by the decoder IC and sent to an ATMEGA8 controller. The main advantage is that it can use both wired and wireless communication; the frequencies are also practical and inexpensive to implement. The drawback of the system is that the number of appliances is limited, as a mobile phone can generate only 16 tones.

4. PROPOSED WORK
This project focuses on helping technology provide easier and safer living for the elderly. The system uses various sensors to ensure safety for the elderly. We use a mobile application to send commands to the cloud over a Wi-Fi based system. On the cloud, these commands are interpreted, and the necessary actions are taken by the actuators, or the requests are answered with responses. The mobile application also handles emergencies: a medical emergency, which would call the ambulance, and a security emergency, which would alert the police.

Assumptions and Dependencies
We assume that the user has a stable internet connection at home and basic knowledge of smart phones. The devices should always be connected to the internet. The user should have an Android smart phone. Proper hardware components should be available.

Requirements
Functional requirements denote the functions that a developer must build into the software to achieve the use-cases. For the proposed system, the functional requirements are: Switching Devices On and Off, Door Lock Down, Select Room to Monitor, View Status of Devices at Home,


Instant Capture of Image, and Medical Emergency Handling.
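The functional requirements listed above can be pictured as a small command dispatcher on the Raspberry Pi side. The device names and command strings below are invented for the sketch; in the real system the commands would arrive from the cloud and the mobile app rather than a local call:

```python
# Illustrative dispatcher for the functional requirements (assumed names).
DEVICES = {"light": False, "fan": False, "door_locked": False}

def handle_command(command, devices=DEVICES):
    """Map an app command to a device action and return the new status."""
    action, _, target = command.partition(" ")
    if action == "on" and target in devices:
        devices[target] = True      # Switching Devices On
    elif action == "off" and target in devices:
        devices[target] = False     # Switching Devices Off
    elif command == "lockdown":
        devices["door_locked"] = True   # Door Lock Down requirement
    return dict(devices)            # View Status of Devices at Home

handle_command("on light")
print(handle_command("lockdown"))
```

Every branch ends by returning the full status dictionary, so a single ―status‖ round trip to the app always reflects the latest state.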

5. SYSTEM ARCHITECTURE

Fig. 3. Data flow diagram level 0

Fig. 1. System Architecture

Fig. 1 above shows the architecture of the proposed system. The sensors and actuators are connected to, and powered by, the Raspberry Pi. The Raspberry Pi provides the support to send requests to and receive responses from the cloud. The cloud, via the IFTTT service, sends messages to the mobile application, and the mobile application responds to the Raspberry Pi through the cloud. The user uses the Android application, created with MIT App Inventor, as an interface to the cloud in order to give commands.

Fig.2. Steps Involved

The steps shown in Fig. 2 above are the basic ones required in the project. In the Assembly of Hardware phase, we assemble all the hardware, which includes the set of sensors and actuators, and set up a connection between the Raspberry Pi, the breadboard and the devices. In the Services phase we use Amazon Web Services IoT, IFTTT and MIT App Inventor. The Application phase is based on creating an elder-friendly interface.


Fig. 4. Data flow diagram level 1

Figs. 3 and 4 show the data flow diagrams of the model: a graphical representation of the flow of data through the information system. They also show the preliminary step of creating an overview of the system. The DFD level 0 shows the Android application taking voice commands and giving them to the Raspberry Pi, which then takes the action. The DFD level 1 is a more detailed view of level 0; here we show what types of commands are sent by the user to the Raspberry Pi and ultimately to the application.

Algorithm
The algorithm to be used for the light and fan reads the variable value from the button on the Android application and accordingly actuates the necessary action, i.e., either switching the device ON or OFF or setting the intensity/speed value.
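As a sketch of that button-to-actuation step (the 0-100 scale and function names are assumptions for the example, with a PWM-style duty cycle standing in for the actual GPIO calls):

```python
# Sketch of the light/fan algorithm: a button value from the app becomes
# an on/off state and a PWM duty cycle. Scale and names are assumed.

def actuate(button_value):
    """button_value: 0 turns the device off; 1-100 sets intensity/speed."""
    if button_value <= 0:
        return {"state": "OFF", "duty_cycle": 0.0}
    level = min(button_value, 100)          # clamp out-of-range app values
    return {"state": "ON", "duty_cycle": level / 100.0}

print(actuate(0))    # device switched off
print(actuate(75))   # fan at 75% speed / light at 75% brightness
```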

Fig. 5. Algorithm for Light and Fan

For the working of the door, the proposed algorithm requires the system to be under lockdown. Under such circumstances, if the door is opened by any means, it will set off the alarm and also


send the notification to the necessary people.
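That lockdown rule can be sketched as a small check; the contact list and event flags are invented for the example:

```python
# Sketch of the door algorithm: while the system is in lockdown, any
# door-open event raises the alarm and notifies the stored contacts.
CONTACTS = ["family", "police"]   # illustrative contact list

def on_door_event(lockdown, door_opened):
    """Return the actions the system should take for a door event."""
    if lockdown and door_opened:
        return {"alarm": True, "notify": list(CONTACTS)}
    return {"alarm": False, "notify": []}

print(on_door_event(lockdown=True, door_opened=True))
```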

Fig. 6. Algorithm for Door

Other Specifications
Advantages
Remote monitoring of home appliances: home appliances such as lights, fans, doors etc. can be monitored easily with the help of an Android application and/or a voice recognition tool.
Security for senior citizens: the application also provides intrusion detection along with door lock down whenever required. The feature of instant face capture is provided to help detect the identity of an intruder.
Medical emergency call/SMS: an extra ―Emergency‖ button is provided in the application to make a call or send an SMS to the immediate emergency contact or to the hospital in case of medical emergencies.
Energy management: the app provides visual aids synced to the current status of the remote devices, and the instant-capture feature avoids worrying over suspicious activities and helps in clearing paranoia.
Limitations
Auto-detecting a medical emergency: in case of a medical emergency, elderly people have to act on their own, either by using the voice recognition tool or by opening the Android application and selecting the ―Emergency‖ button; the system cannot detect an emergency on its own.

6. CONCLUSION AND FUTURE SCOPE
The main objective of our project is to make life easier for senior citizens by introducing a cost-efficient system that can connect appliances remotely. The communication link between the appliances and the remote user plays an important role in automation. This project includes a voice-controlled home automation system that uses speech recognition to implement this work. It is used to remotely control home appliances through smart devices, so that one can remotely check the status of the home appliances and turn them ON or OFF. One can also keep track of the security of one's valuables whenever required. In future advanced installations, rooms will be able to sense not only the presence of a person inside but know who that person is, and perhaps set appropriate lighting, temperature, music levels or television channels, taking into account the day of the week, the time of day, and other factors. The future of IoT is virtually unlimited due to advances in technology and consumers' desire to integrate devices such as smart phones with household machines. The possibilities are exciting: productivity will increase, and amazing things will come from connecting the world.

REFERENCES
[1] Ravi Kishore Kodali, Vishal Jain, Suvadeep Bose and Lakshmi Boppana, ―IoT Based Smart Security and Home Automation System‖, International Conference on Computing, Communication and Automation 2016.
[2] Hakar Mohsin Saber, Nawzad Kamaran Al-Salihi, ―IoT: Secured and Automated House‖, International Journal of Engineering Science and Computing 2017.
[3] Rucha R. Jogdand, B. N. Choudhari, ―Dual Tone Multi Frequency based Home Automation System‖, IEEE 2017.
[4] Prof. R.S. Suryavanshi, Kunal Khivensara, Gulam Hussain, Nitish Bansal, Vikash Kumar, ―Home automation system using android and Wi-Fi‖, International Journal of Engineering and Computer Science 2014.
[5] B. R. Pavithra, D., ―IoT based monitoring and control system for home automation,‖ April 2015.
[6] B. S. S. Tharaniya soundhari, M., ―Intelligent interface-based speech recognition for home automation using android application,‖ March 2015.
[7] R. A. Ramlee, M. A. Othman, M. H. Leong, M. M. Ismail and S. S. S. Ranjit, ―Smart home system using android application‖, International Conference of Information and Communication Technology 2013.


SMART TRAFFIC CONTROL SYSTEM USING TIME MANAGEMENT
Gaikwad Kavita Pitambar1, More Sunita Vitthal2, Nalge Bhagyashree Muktaji3
1,2,3 Computer Engineering, SCSMCOE, Nepti, Ahmednagar, India.
[email protected], [email protected], [email protected]

ABSTRACT
An automated Raspberry Pi based traffic control system using sensors, along with live web updates, can be a helpful step in optimizing the traffic flow pattern at busy intersections. This intuitive design of the transport infrastructure can help alleviate the traffic congestion problem in crowded cities. This paper describes a system in which photoelectric sensors are integrated with a Raspberry Pi to operate the lanes of an intersection based on the density of traffic. The current condition of the intersection is updated on a user-accessible website. Four photoelectric sensors are mounted on each road; the distance between these sensors depends on the nature of traffic at a particular junction. These sensors sense the traffic on that road. As a result, the traffic system can be incrementally enhanced, eventually leading to a significant improvement overall.

Keywords
smart traffic control system; Raspberry Pi; photoelectric sensor; traffic congestion.

1. INTRODUCTION
In modern life we have to face many problems, one of which is traffic congestion, which becomes more serious day after day. It is said that the high volume of vehicles, inadequate infrastructure and the irrational distribution of development are the main reasons for increasing traffic jams. The major cause of traffic congestion is the high number of vehicles, driven by population growth and the development of the economy. Traffic congestion is a condition on road networks that occurs as use increases, and is characterized by slower speeds, longer trip times and increased vehicular queuing. The most common example is the physical use of roads by vehicles. When traffic demand is great enough that the interaction between vehicles slows the speed of the traffic stream, the result is congestion, known as a traffic jam or traffic snarl-up. Traffic congestion can lead to drivers becoming frustrated and engaging in road rage.

In order to avoid congestion in traffic environments, Traffic Sign Recognition (TSR) is used to regulate traffic signs, warn the driver, and command or prohibit certain actions. Fast, real-time and robust automatic traffic sign detection and recognition can support and disburden the driver and thus significantly increase driving safety and comfort. Generally, traffic signs provide the driver with various information for safe and efficient navigation. Automatic recognition of traffic signs is therefore important for automated intelligent vehicles and driver assistance systems. However, identification of traffic signs under various natural background and viewing conditions remains a challenging task. Real-time automatic vision-based traffic light control has recently attracted the interest of many researchers, due to frequent traffic jams at major junctions and the resulting wastage of time. Instead of depending on information generated by


costly sensors, the economic situation calls for using available video cameras in an efficient way for effective traffic congestion estimation. Researchers may focus on one or more of these tasks, and they may also choose different measures of traffic structure or add measures, giving a more comprehensive view of vision-based traffic light control. Due to the massive growth in urbanization and traffic congestion, an intelligent vision-based traffic light controller is needed to reduce traffic delay and travel time, especially in developing countries, since current automatic time-based control is not realistic, while sensor-based traffic light controllers are not reliable in developing countries. Traffic congestion is now considered one of the biggest problems in urban environments. Traffic problems will also increase much more widely as an expected result of the growing number of means of transportation and the current low-quality road infrastructure. In addition, many studies and statistics from developing countries have shown that most road accidents are caused by very narrow roads and by the destructive increase in means of transportation. A Raspberry Pi microcomputer and multiple ultrasonic sensors are used in each lane to calculate the density of traffic and operate the lane based on that calculation. This idea of controlling the traffic light efficiently in real time has attracted many researchers to work in this field, with the goal of creating an automatic tool that can estimate the traffic congestion and, based on this variable, forecast the traffic signal time interval.

2. WORKING
In the proposed system, supply is given to a step-down transformer. The output of the transformer is connected to the input of a full-wave bridge rectifier. The output of the bridge rectifier is given to a voltage regulator. The regulator provides the +5 V supply that powers all the electronic components of the system.

3. BLOCK DIAGRAM

Fig. 1 Block Diagram

A 16×2 alphanumeric LCD display is used to show real-time information about the traffic signal. Four sensors are used; when any sensor senses a vehicle, its signal goes to the Raspberry Pi, and the Raspberry Pi's output drives the relay driver. The relay switches ON, at which time the LED turns ON and the LCD displays the time.

4. SYSTEM DESIGN

The figure shows the overall design of the system. At the intersection, each outgoing lane has four photoelectric sensors that measure and report the traffic conditions of the lane to the Raspberry Pi. The Raspberry Pi uses this information to set the signal timer according to the level of traffic.
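Since each lane has four photoelectric sensors, the count of triggered sensors gives a coarse density level that can drive the timer. The sketch below illustrates this mapping; the base green time and per-level increment are illustrative assumptions, not values from the paper.

```python
# Sketch of the density-to-timer logic: count how many of a lane's four
# photoelectric sensors are triggered, then scale the green time accordingly.
# BASE_GREEN_S and EXTRA_PER_LEVEL_S are assumed values for illustration.

BASE_GREEN_S = 10      # minimum green time in seconds (assumption)
EXTRA_PER_LEVEL_S = 5  # extra seconds per triggered sensor (assumption)

def density_level(sensor_states):
    """Count how many of a lane's four sensors currently detect a vehicle."""
    if len(sensor_states) != 4:
        raise ValueError("each lane has four photoelectric sensors")
    return sum(1 for s in sensor_states if s)

def green_time(sensor_states):
    """Set the signal timer according to the level of traffic in the lane."""
    return BASE_GREEN_S + EXTRA_PER_LEVEL_S * density_level(sensor_states)
```

For example, a lane with two triggered sensors would get `green_time([True, True, False, False])` seconds of green.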


Fig. The model of the system

5. COMPONENTS
The components used in this system are listed below:

A. Photoelectric sensor
Used to detect the distance, absence, or presence of an object by means of a light transmitter, often infrared, and a photoelectric receiver.

B. Raspberry Pi 3
The Raspberry Pi is a miniature computer with an operating system that can be used as a development tool for different software- and hardware-based projects. In this project, the third-generation Raspberry Pi was used for its superior processing power compared to other available microcontrollers.

C. Display
This display is used to show the traffic timers.

D. Relay
A relay is an electrically operated switch.

E. Driver ULN2003
The IC ULN2003A is a Darlington transistor array which deals with high voltage and high current.

6. ASSEMBLY
The methods used to assemble all the components are discussed in this section. Table I shows the number of I/O pins used in the design and how they are distributed among the components; it also shows how the number of I/O pins was reduced to increase the efficiency of the system.

Component              Units   I/O pins per unit   Total I/O pins
Photoelectric sensor     4             3                 12
LED                      8             2                 16
16×2 display             1            14                 14
Driver ULN2003           1            16                 16
Relay                    4             5                 20

Table: Assembly Components

6. FLOW CHART

Fig: Flowchart of the system.
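The flowchart's control loop (sensor fires → Raspberry Pi → relay driver → relay and LED ON → LCD shows the time) can be sketched as below. Hardware access is replaced by stand-in callbacks, so the function names and the 10-second default are illustrative, not from the paper.

```python
# Stubbed walk-through of the flowchart: poll each lane's sensor; when one
# fires, energise its relay (through the ULN2003 driver, which also lights
# the LED) and show the remaining green time on the 16x2 LCD.
# actuate_relay/show_on_lcd stand in for real GPIO and LCD driver calls.

def run_cycle(sensor_reads, actuate_relay, show_on_lcd, green_time_s=10):
    """sensor_reads: one boolean per lane for this polling pass.
    Returns the list of lanes that triggered during the pass."""
    events = []
    for lane, sensed in enumerate(sensor_reads):
        if sensed:
            actuate_relay(lane, True)                 # relay ON -> LED ON
            show_on_lcd(f"Lane {lane}: {green_time_s}s")
            events.append(lane)
    return events

# Example with recording stubs in place of the GPIO/LCD drivers:
log = []
lanes = run_cycle([False, True, False, True],
                  actuate_relay=lambda lane, on: log.append(("relay", lane, on)),
                  show_on_lcd=lambda msg: log.append(("lcd", msg)))
```

On real hardware the two callbacks would wrap the relay-driver and LCD libraries; keeping them as parameters makes the control logic testable without the board.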

7. FUTURE WORK
More sensors can be used in each lane to make the system more accurate and sensitive to small changes in traffic density. Driverless cars can access the website to view the intensity of traffic at


an intersection and choose the fastest route accordingly. Data mining techniques such as classification can be applied to traffic data collected over the long term to study the patterns of traffic in each lane at different times of the day. Using this information, different timing algorithms can be used at different times of day according to the traffic pattern.

8. CONCLUSION
Nowadays, traffic congestion is a major problem in big cities, since traffic signal lights are programmed for fixed time intervals. However, sometimes the demand for a longer green light arises on one side of the junction due to high traffic density. Thus, the traffic signal system is enhanced to generate traffic-light signals based on the traffic on the roads at that particular instant. Advanced technologies and sensors have given us the capability to build smart and intelligent embedded systems that solve human problems and improve everyday life. Our system is capable of estimating traffic density using IR sensors placed on either side of the roads. Based on this, the time delay for the green light can be increased and we can reduce unnecessary


waiting time. The whole system is controlled by the Raspberry Pi. The designed system was implemented and tested to ensure its performance and other design factors.

REFERENCES
[1] R. Dhakad and M. Jain, "GPS based road traffic congestion reporting system," 2014 IEEE International Conference on Computational Intelligence and Computing Research, Coimbatore, 2014, pp. 1-6. doi: 10.1109/ICCIC.2014.7238547
[2] Q. Xinyun and X. Xiao, "The design and simulation of traffic monitoring system based on RFID," The 26th Chinese Control and Decision Conference (2014 CCDC), Changsha, 2014, pp. 4319-4322. doi: 10.1109/CCDC.2014.6852939
[3] M. F. Rachmadi et al., "Adaptive traffic signal control system using camera sensor and embedded system," TENCON 2011 - 2011 IEEE Region 10 Conference, Bali, 2011, pp. 1261-1265. doi: 10.1109/TENCON.2011.6129009
[4] X. Jiang and D. H. C. Du, "BUS-VANET: A BUS Vehicular Network Integrated with Traffic Infrastructure," IEEE Intelligent Transportation Systems Magazine, vol. 7, no. 2, pp. 47-57, Summer 2015. doi: 10.1109/MITS.2015.2408137
[5] I. Septiana, Y. Setiowati and A. Fariza, "Road condition monitoring application based on social media with text mining system: Case Study:


THE POTHOLE DETECTION: USING A MOBILE SENSOR NETWORK FOR ROAD SURFACE MONITORING
Sanket Deotarse1, Nate Pratiksha2, Shaikh Kash3, Sonnis Poonam4
1,2,3,4 Computer Engineering, Shri Chatrapati Shivaji Maharaj College of Engineering, Ahmednagar, India.
[email protected], [email protected], [email protected], [email protected]

ABSTRACT
The pothole detection system is a unique concept, useful to anyone who faces potholes on their route. The technology is new, and the idea is to generate a profile of the potholes encountered during a vehicle's journey. Access to timely and accurate road condition information, especially about dangerous potholes, is of great importance to the public and the government. We implement an effective road surface monitoring system for automated pothole detection. It is a unique, low-cost solution for road safety: it will help to avoid accidents and can be used to identify problem areas early. The authorities can be alerted to take preventive actions, and preventive actions can save money. Poorly maintained roads are a fact of life in most developing countries, including India, while a well-maintained road network is a must for the well-being and the development of any country. We therefore create an effective road surface monitoring system, with automated pothole detection as its focus.

1. INTRODUCTION
We are going to develop an effective road surface monitoring system for automated pothole detection. This is a low-cost solution for road safety. It will help to avoid accidents and can be used to identify problem areas early. The authorities can be alerted to take preventive actions; preventive actions can save money. Poorly maintained roads are a fact of life in most developing countries, including India. A well-maintained road network is a must for the well-being and the development of any country. We are therefore going to create an effective road surface monitoring system, with automated pothole detection as its focus. This is a first-of-its-kind system for pothole detection, and it uses a wireless sensor network. In summary:
1. We are going to develop an effective road surface monitoring system for automated pothole detection.
2. This is a low-cost solution for road safety.
3. It will help to avoid accidents and can be used to identify problem areas early.
4. The authorities can be alerted to take preventive actions; preventive actions can save money.

As noted in "Pothole in the Dark: Perceiving Pothole Profiles with Participatory Urban Vehicles", over the past few years there has been a large increase in the vehicle population. This increase has led to more road accidents and also traffic congestion. According to the Global Road Safety Report 2015, released by the World Health Organization (WHO), India accounts for more than 200,000 deaths because of road accidents. These accidents can be due to over-speeding, drunk driving, jumping traffic signals, and also humps, speed-breakers and potholes. Hence it is important to collect information regarding these poor road conditions and distribute it to other vehicles, which in turn helps reduce accidents caused by potholes and humps. Hence, we have proposed a system that notifies drivers of hurdles such as potholes and humps; this information can also be used by the Government to repair these roads effectively. Our objective is to develop a


system based on IoT to detect potholes on the road; detections are uploaded to a server, all users are notified through the application, and the data is updated as conditions change.

2. MOTIVATION
This research work is helpful for improving smart city applications. The authorities can be alerted to take preventive actions; preventive actions can save money.

3. PROBLEM STATEMENT
Existing systems cannot give proper road condition information. This technology is new, and the idea is to generate a profile of the potholes on a vehicle's journey. The application gives access to timely and accurate road condition information, especially about dangerous potholes, which is of great importance to the public and the government.

4. OBJECTIVES
1. To develop an effective road surface monitoring system for automated pothole detection.
2. To provide a low-cost solution for road safety.
3. To help avoid accidents and identify problem areas early.
4. To alert the authorities to take preventive actions; preventive actions can save money.
5. To notify users.
6. To update as per the latest road condition.

5. PROPOSED SYSTEM
The proposed system consists of entities such as an ultrasonic sensor and a microcontroller for pothole detection. We are going to develop an effective road surface monitoring system for automated pothole detection. This is a low-cost solution for road safety.


Fig 1. Project Idea

The system automatically detects potholes and humps and sends this information to vehicle drivers, so that they can avoid accidents. This is a cost-efficient solution for the detection of humps and potholes. The system is effective even in the rainy season, when roads are covered with rain water, and in winter during low visibility, because the alerts are sent from the information stored on the server/database. The system helps drivers avoid dreadful potholes and humps, and hence avoid tragic accidents due to bad road conditions.
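Since alerts come from hazard coordinates stored on the server, the driver's app needs a proximity check between the vehicle's GPS position and the stored records. A minimal sketch follows, using the standard haversine distance; the 50 m alert radius and the record layout are assumptions for illustration, not from the paper.

```python
import math

# Sketch of the alerting idea: compare the vehicle's GPS position with the
# pothole/hump coordinates stored in the server database and warn the driver
# when one is nearby. Radius and record fields are illustrative assumptions.

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two GPS coordinates."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def nearby_hazards(vehicle_pos, hazards, radius_m=50.0):
    """Return the stored hazards within radius_m of the vehicle's position."""
    lat, lon = vehicle_pos
    return [h for h in hazards
            if haversine_m(lat, lon, h["lat"], h["lon"]) <= radius_m]
```

The app would call `nearby_hazards` on each GPS fix and raise an alert whenever the returned list is non-empty.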

Fig 2. System Architecture

6. IMPLEMENTATION MODULES
6.1 Mobile Application Module:
The user receives pothole notifications from the system for a safe journey.
6.2 Server Module:


The server module is the database of the system, an intermediate layer between the sensing module and the mobile application module. Its function is to store the updated information received from the sensors and provide it to a requesting user whenever needed. This module is also updated frequently with information about potholes and humps.
6.3 Microcontroller Module:
This module is responsible for coordinating the hardware and the server.
6.4 Sensing Module:
This module consists of a GPS receiver, an ultrasonic sensor (HC-SR04) and a GSM SIM900 modem. The distance between the car body and the road surface is calculated with the help of the ultrasonic sensor. A threshold value is set based on the ground clearance of the vehicle. The calculated distance (the depth parameter) is compared with the threshold to detect a pothole or a hump: if the calculated distance is greater than the threshold, the reading is classified as a pothole; if it is less, it is classified as a hump. The location coordinates are fetched by the GPS receiver, and the information about the detected pothole or hump at that location is sent to the server using the GSM modem.
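The sensing rule above translates directly into a small classifier. The sketch below follows the paper's greater/less comparison; the 30 cm threshold and the tolerance band (to keep ordinary road noise from triggering) are illustrative additions, not values from the paper.

```python
# Transcription of the sensing-module rule: the measured car-body-to-road
# distance is compared with a threshold derived from the vehicle's ground
# clearance. Threshold and tolerance values are assumptions for illustration.

def classify_surface(distance_cm, threshold_cm=30.0, tolerance_cm=2.0):
    """Classify one ultrasonic reading as 'pothole', 'hump', or 'normal'."""
    if distance_cm > threshold_cm + tolerance_cm:
        return "pothole"   # surface dips away: deeper than ground clearance
    if distance_cm < threshold_cm - tolerance_cm:
        return "hump"      # surface rises toward the sensor
    return "normal"        # within tolerance of normal ground clearance
```

Each classified reading, paired with the GPS fix, is what the GSM modem would forward to the server.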

7. METHODOLOGY
We implement this system to avoid obstacles on our route for a safe journey and to keep the vehicle in proper condition. We use the following algorithm to implement the detection system.

Algorithm details:
Input: sensor values.
Output: the output is positive (one) when the pothole detection system encounters a pothole during the car's journey.

The following code shows the operations performed within the system and the sequence in which they are performed:

    sensor_reading[ ]                        // depth parameter readings
    for (k = 0; k < no_of_sensors - 1; k++)
        x = sensor_reading[k];               // current value
        y = sensor_reading[k + 1];           // next value, checked against threshold
        if (abs(x - y) > pothole_threshold)  // also guards against sensor malfunction
            pothole_flag = true;
            timestamp = current_time;

8. CONCLUSION AND FUTURE SCOPE
In this paper, we have proposed a system which detects potholes on the road, saves the information on the server, and reduces the vehicle speed if needed. Potholes form due to rain and oil spills, and they cause accidents. The potholes are detected, and their height, depth and size are measured, using an ultrasonic sensor. GPS is used to find the location of each pothole, and all the information is saved in the database. This timely information can help repair the road as fast as possible. By controlling the rate of fuel injection we can control the rotation of the drive shaft, monitored by means of a non-contact IR tachometer; this helps reduce the vehicle speed when a pothole or hump is detected. Hence the system will help to avoid road accidents.

9. ACKNOWLEDGMENT
We express our sincere thanks to our project guide, Prof. Lagad J. U., for his constant presence and constructive criticism while preparing this paper. We would also like to thank all the staff of the Computer Department for their valuable guidance, suggestions and support throughout the project work, given with personal attention. Above all, we express our deepest gratitude to all of them for their kind-hearted support, which helped us a lot during the project work. Finally, we are thankful to our friends and colleagues for the inspirational help provided to us throughout the project work.

REFERENCES
[1] S. S. Rode, S. Vijay, P. Goyal, P. Kulkarni, and K. Arya, "Pothole detection and warning system: Infrastructure support and system design," in Proc. Int. Conf. Electron. Comput. Technol., Feb. 2009, pp. 286-290.
[2] R. Sundar, S. Hebbar, and V. Golla, "Intelligent traffic control system for congestion control, ambulance clearance, and stolen vehicle detection," IEEE Sensors J., vol. 15, no. 2, pp. 1109-1113, Feb. 2015.


[3] Samyak Kathane, Vaibhav Kambli, Tanil Patel and Rohan Kapadia, "Real Time Potholes Detection and Vehicle Accident Detection and Reporting System and Anti-theft (Wireless)," IJETT, vol. 21, no. 4, March 2015.
[4] J. Lin and Y. Liu, "Potholes detection based on SVM in the pavement distress image," in Proc. 9th Int. Symp. Distrib. Comput. Appl. Bus. Eng. Sci., Aug. 2010, pp. 544-547.
[5] I. Moazzam, K. Kamal, S. Mathavan, S. Usman, and M. Rahman, "Metrology and visualization of potholes using the Microsoft Kinect sensor," in Proc. 16th Int. IEEE Conf. Intell. Transp. Syst., Oct. 2013, pp. 1284-1291.


IOT BASED AGRICULTURAL SOIL PREDICTION FOR CROPS WITH PRECAUTIONS
Prof. Yashanjali Sisodia1, Pooja Gahile2, Chaitali Meher3
1,2,3 Department of Computer Engineering, GHRCOEM, Ahmadnagar, India.
[email protected], [email protected], [email protected]

ABSTRACT
The present study focuses on the applications of data mining techniques to yield prediction in the face of climatic change, to help the farmer take decisions for farming and achieve the expected economic return. Yield prediction is a major problem that can be solved based on available data, and data mining techniques are a good choice for this purpose. Different data mining techniques have been used and evaluated in agriculture for estimating future crop production. We therefore present a brief analysis of crop yield prediction using the k-Nearest Neighbor (kNN) technique and a density-based clustering technique for a selected region, the Pune district of Maharashtra in India.

General Terms
In this work, two important and well-known classification algorithms, k-Nearest Neighbor (kNN) and density-based clustering, are applied to the dataset.

Keywords
Data Mining, Machine Learning, Classification Rule, K Nearest Neighbor (KNN), Density Based Clustering.

1. INTRODUCTION
The study focuses on the applications of data mining techniques to yield prediction in the face of climatic change, to help the farmer take decisions for farming and achieve the expected economic return. Yield prediction is a major problem that can be solved based on past data. We therefore present a brief analysis of crop yield prediction using the k-Nearest Neighbor (kNN) technique for the selected region of India. The patterns of crop production in response to climatic effects (rainfall, temperature, relative humidity, evaporation and sunshine) across the selected regions of Maharashtra are modeled using the kNN technique. It will be beneficial if farmers can use the technique to predict future crop productivity and consequently adopt alternative adaptive measures to maximize yield.

2. LITERATURE REVIEW
Gregory S. McMaster, D. A. Edmunds, W. W. Wilhelm, D. C. Nielsen, P. V. V. Prasad and J. C. Ascough, "PhenologyMMS: A program to simulate crop phenological responses to water stress," Computers and Electronics in Agriculture 77 (2011) 118-125. Crop phenology is fundamental for understanding crop growth and development, and it increasingly influences many agricultural management practices. Water deficits are one environmental factor that can influence crop phenology by shortening or lengthening a developmental phase, yet the phenological responses to water deficits have rarely been quantified. The objective of this paper is to provide an overview of a decision support software tool, PhenologyMMS V1.2, developed to simulate the phenology of various crops for varying levels of soil water. The program is intended to be simple to use, requires minimal information for calibration, and can be incorporated into other crop simulation models. It consists of a Java interface connected to FORTRAN science modules to simulate phenological responses. The complete developmental sequence of the


shoot apex correlated with phenological events, and the response to soil water availability, for winter and spring wheat (Triticum aestivum L.), winter and spring barley (Hordeum vulgare L.), corn (Zea mays L.), sorghum (Sorghum bicolor L.), proso millet (Panicum miliaceum L.), hay/foxtail millet [Setaria italica (L.) P. Beauv.] and sunflower (Helianthus annuus L.) was created based on experimental data and the literature. Model evaluation consisted of testing the algorithms using "generic" default phenology parameters for wheat (i.e., no calibration for specific cultivars) in a variety of field experiments to predict developmental events. The results demonstrated that the

program has general applicability for predicting crop phenology and can aid in crop management.

3. SYSTEM ARCHITECTURE
The preparation of the soil is the first step before growing a crop. One of the most important tasks in agriculture is to penetrate deep into the soil and loosen it; the loosened soil allows the roots to breathe easily even when they go deep into the soil.

Fig: Prediction is a major problem that can be solved.

3. SYSTEM ANALYSIS
The aim is to design and develop information technology for agriculture. The agrarian sector in India is facing a rigorous problem in maximizing crop productivity. The present study focuses on the applications of data mining techniques to yield prediction in the face of climatic change, to help the farmer take decisions for farming and achieve the expected economic return. The problem of yield, as well as disease, can be predicted based on available statistics. We therefore propose a system for the prediction of crop disease according to weather conditions.
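The kNN step named above can be sketched in a few lines: records of climatic features vote for the yield class of a query. The feature vector (rainfall mm, mean temperature °C, relative humidity %) and the toy records below are illustrative assumptions, not the study's dataset.

```python
import math
from collections import Counter

# Minimal kNN sketch for the yield-prediction idea: the k nearest training
# records (by Euclidean distance over climatic features) vote on the class.
# Features and toy records are illustrative, not the study's actual data.

def knn_predict(train, query, k=3):
    """train: list of (feature_tuple, label); query: feature tuple."""
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy records: (rainfall mm, temperature C, humidity %) -> yield class
train = [
    ((700, 25, 60), "high"),
    ((650, 26, 58), "high"),
    ((400, 32, 35), "low"),
    ((380, 33, 30), "low"),
]
prediction = knn_predict(train, (680, 25, 59), k=3)
```

With real data, the features would be the study's climatic variables (rainfall, temperature, relative humidity, evaporation, sunshine) and the labels the observed yield classes for the Pune district.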

4. ACKNOWLEDGMENTS
I would like to thank the researchers and publishers for making their resources available. I am also grateful to my guide and the reviewers for their valuable suggestions, and I thank the college authorities for providing the required infrastructure and support.

5. RESULTS
Agriculture is the backbone of the Indian economy. In India, the majority of farmers are not getting the expected crop yield, and suffer crop disease, due to numerous reasons. Agricultural yield largely depends on weather conditions. Rainfall conditions also


influence rice cultivation. In this context, the farmers necessarily require timely advice to predict future crop productivity and disease, and an analysis is to be made in order to help the farmers maximize the crop production of their fields.

REFERENCES
[1] Adams, R., Fleming, R., Chang, C., McCarl, B., and Rosenzweig, C., "A Reassessment of the Economic Effects of Global Climate Change on U.S. Agriculture," Unpublished, September 1993.
[2] Adams, R., Glyer, D., and McCarl, B., "The Economic Effects of Climate Change on U.S. Agriculture: A Preliminary Assessment," in Smith, J., and Tirpak, D., eds., The Potential Effects of Global Climate Change on the United States, Washington, D.C.: USEPA, 1989.
[3] Adams, R., Rosenzweig, C., Peart, R., Ritchie, J., McCarl, B., Glyer, D., Curry, B., Jones, J., Boote, K., and Allen, H., "Global Climate


Change and U.S. Agriculture," Nature, 345 (6272, May 1990): 219-224.
[4] "Adaptation to Climate Change Issues of Long-run Sustainability." An Economic Research
[5] Barron, E. J., "Advances in Predicting Global Warming," The Bridge (National Academy of Engineering), 25 (2, Summer 1995): 10-15.
[6] Barua, D. N., Science and Practice in Tea Culture, second ed., Tea Research Association, Calcutta-Jorhat, India, 2008.
[7] D. Ramesh, B. Vishnu Vardhan, "Data mining techniques and applications to agricultural yield data," International Journal of Advanced Research in Computer and Communication Engineering, vol. 2, issue 9, September 2013.
[8] Gideon O. Adeoye, Akinola A. Agboola, "Critical levels for soil pH, available P, K, Zn and Mn and maize ear-leaf content of P, Cu and Mn in sedimentary soils of South-Western Nigeria," Nutrient Cycling in Agroecosystems, vol. 6, issue 1, pp. 65-71, February 1985.


IoMT HEALTHCARE: SECURITY MEASURES
Ms. Swati Subhash Nikam1, Ms. Ranjita Balu Pandhare2
1 Department of Computer Engineering, JSPM's RSCOE, Tathawade, Pune, India.
2 Department of Computer Science & Engineering, KIT's College of Engineering, Kolhapur, India.
[email protected], [email protected]

ABSTRACT
IoT, the vast network of connected things and people, enables users to collect and analyze data through the use of connected devices. In healthcare, prevention and cure have seen diverse advances in the technological schema, and medical equipment used with this advanced technology also sees internet integration. Such equipment used with the Internet of Things is termed the Internet of Medical Things (IoMT). IoMT is transforming the healthcare industry by providing large-scale connectivity for medical devices, patients, physicians, and the clinical and nursing staff who use them, and it facilitates real-time monitoring based on the information gathered from the connected things. Security constraints for IoMT take confidentiality, integrity and authentication as the prime key aspects. These have to be obtained at each stage, through the integration of physical devices such as sensors for connectivity and communication, in a cloud-based facility which in turn is delivered through a user interface. The security strategies of access control and data protection have to be applied at various layers of the IoMT architecture. Access control security is obtained by key generation for the data owners and data users of personal health records, while data protection security is obtained by use of the Advanced Encryption Standard (AES) algorithm.

General Terms
IoT, Security, Algorithm, Healthcare

Keywords
IoT, Healthcare, IoMT, Security, Cloud-based, Personal Health Records, Privacy, Access Control.

1. INTRODUCTION
In recent times the Internet has penetrated our everyday life, and many things have revolutionized the way we manage our lives; the Internet of Things (IoT) tops this list. IoT, the vast network of connected things and people, enables users to collect and analyze data through the use of connected devices. In healthcare, prevention and cure have seen diverse advances in the technological schema; chronic care and preventive care both stand on an equal level with the same advancement in technology. Medical equipment used with this advanced technology also sees internet integration; such equipment used with the Internet of Things is termed the Internet of Medical Things (IoMT). The IoMT is virtually the collection of medical devices and applications that connect to healthcare IT systems through online computer networks. As the number of connected medical devices increases, the power of IoMT grows, as does the scope of its application, be it remote patient monitoring of people with chronic or long-term conditions, tracking patient medication orders, or patients' wearable health devices, which can send information to caregivers. This new practice of using IoMT devices to remotely monitor patients in their homes spares them from traveling to a hospital whenever they have a medical question or a change in their condition. This has revolutionized the whole healthcare ecosystem and doctor-patient communication.

Basic medical health records of patients are stored as Personal Health Records (PHRs). Numerous methods have been employed to ensure the privacy of the PHRs stored on

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 36

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

the cloud servers. The privacy-preserving approaches ensure confidentiality, integrity, authenticity, accountability, and audit trail. Confidentiality ensures that the health information is entirely concealed from unsanctioned parties, whereas integrity deals with maintaining the originality of the data, whether in transit or in cloud storage. Authenticity guarantees that the health data is accessed by authorized entities only, whereas accountability refers to the fact that the data access policies must comply with the agreed-upon procedures.

2. RELATED WORK
A Review on the State-of-the-Art Privacy Preserving Approaches in the eHealth Clouds [16]
This paper aimed to encompass the state-of-the-art privacy-preserving approaches employed in the e-Health clouds. Automated PHRs are exposed to possible abuse and require security measures based on identity management, access control, policy integration, and compliance management. The privacy-preserving approaches are classified into cryptographic and non-cryptographic approaches, and a taxonomy of the approaches is also presented. Furthermore, the strengths and weaknesses of the presented approaches are reported and some open issues are highlighted. The cryptographic approaches reduce the privacy risks by utilizing certain encryption schemes and cryptographic primitives. These include public key encryption, symmetric key encryption, and alternative primitives such as attribute-based encryption, identity-based encryption, and proxy re-encryption.

A General Framework for Secure Sharing of Personal Health Records in Cloud System [17]
In this paper, the authors provided an affirmative answer to the problem of sharing by presenting a general framework for secure sharing of PHRs. This system enables patients to securely store and share their PHR in the cloud server (for example, with their care-givers); furthermore, the treating doctors can refer the patients' medical records to specialists for research purposes, whenever required, while ensuring that the patients' information remains private. This system also supports cross-domain operations (e.g., with different countries' regulations).

Electronic Personal Health Record Systems: A Brief Review of Privacy, Security, and Architectural Issues [18]
This paper addressed design and architectural issues of PHR systems, and focused on privacy and security issues which must be addressed carefully if PHRs are to become generally acceptable to consumers. In conclusion, the general indications are that there are significant benefits to PHR use, although there are architecturally specific risks to their adoption that must be considered. Some of these relate directly to consumer concerns about security and privacy, and the authors discussed these in the context of several different PHR system architectures that have been proposed or are in trial. In Germany, the choice of the standalone smartcard PHR is close to national implementation. In the United States, implementations and/or tests of all the suggested architectures except the standalone smartcard are underway. In the United Kingdom, the National Health Service (NHS) appears to have settled on an integrated architecture for PHRs.

Achieving Secure, Scalable and Fine-grained Data Access Control in Cloud Computing [19]
This paper addressed a challenging open issue by, on one hand, defining and enforcing access policies based on data attributes and, on the other hand, allowing the data owner to delegate most of the computation tasks involved in fine-grained data access control to untrusted cloud servers without disclosing the underlying


data contents. It achieved this goal by exploiting and uniquely combining techniques of attribute-based encryption (ABE), proxy re-encryption, and lazy re-encryption. The scheme also has the salient properties of user access privilege confidentiality and user secret key accountability. Extensive analysis shows that the scheme is highly efficient and provably secure under existing security models.

3. PHASES IN IOMT
Phase I: Data Collection, Data Acquisition
Physical devices such as sensors play an important role in enhancing safety and improving the quality of life in the healthcare arena. They have inherent accuracy, intelligence, capability, reliability, small size and low power consumption.

Figure 1: Phases in IOMT [4]

Phase II: Storage
The data collected in Phase I should be stored. Generally, IoT components are installed with low memory and low processing capabilities. The cloud is the best solution that takes over the responsibility for storing the data in the case of stateless devices.

Phase III: Intelligent Processing
The IoT analyzes the data stored in the cloud data centers and provides intelligent services for work and life in hard real time. Beyond analyzing and responding to queries, the IoT also controls things. Intelligent processing involves making data useful through machine learning algorithms.

Phase IV: Data Transmission
Data transmission occurs through all parts, from cloud to user. The user may be a doctor, nurse, pharma staff or the patient himself.

Phase V: Data Delivery
Delivery of information takes place through a user interface, which may be a mobile, desktop or tablet. Delivered data is in respect to the role of the person requesting it; doctor-related data and pharma-related data will be different.

Figure 2: IoMT Architecture

4. ATTACKS ON PHASES
Phase I: Data Loss
Data loss refers to losing work accidentally due to hardware or software failure or natural disasters. Data can also be duplicated by intruders. It must be ensured that perceived data are received from the intended sensors only; data authentication can provide integrity and originality.

Phase II: Denial of Service, Access Control
The main objective of a DoS attack is to overload the target machine with many service requests to prevent it from responding to legitimate requests. Unable to handle all the service requests on its own, it delegates the workload to other similar service instances, which ultimately leads to flooding attacks. A cloud system is more vulnerable to DoS attacks, since it supports resource pooling. [7]

Figure 3: Attacks on Phases [4]

Phase III: Authentication
Here the 'proof of identity' is compromised and the password is discovered. Attackers adopt several mechanisms to retrieve passwords stored or transmitted by a computer system in order to launch this attack.
Guessing Attack: In the online guessing scenario the system blocks the user after a certain number of login attempts.
Brute Force Attack: This attack is launched by guessing passwords containing all possible combinations of letters, numbers and alphanumeric characters until the attacker gets the correct password [7].

Phase IV: Flooding
The cloud server, before providing the requested service, checks the authenticity of the requested jobs, and this process consumes CPU, memory, etc. Processing these bogus requests makes legitimate service requests starve, and as a result the server will offload its jobs to another server, which will also eventually arrive at the same situation. The adversary is thus successful in engaging the whole cloud system by attacking one server and propagating the attack further by flooding the entire system.

Phase V: Hacker
A hacker is someone who seeks to breach defenses and exploit weaknesses in a network. Hackers may be motivated by a multitude of reasons, such as profit, protest, information gathering, challenge, recreation, or the desire to evaluate system weaknesses to assist in formulating defenses against potential hackers.

5. SECURITY MEASURES IN IOMT
Sensor Node: Security is essential in sensor nodes which acquire and transmit sensitive data. The constraints on processing, memory and power consumption are very high in these nodes, so cryptographic algorithms based on symmetric keys are very suitable for them; the drawback is that secure storage of the secret keys is required. In this work, a low-cost solution is presented to obfuscate secret keys with Physically Unclonable Functions (PUFs), which exploit the hardware identity of the node. [5]

Access Control: In context-based access control, a context is a multi-dimensional information structure where each dimension is associated with a unique type (value domain). We often want to know "who is wearing what device, when, where, and for what purpose"; we refer to who, what, when, where and why as dimensions. The value associated with a dimension is of a specific type. As an example, with who we can associate a "role", and with where we can associate a location name. A collection of such (dimension, value) pairs is a context. [14]
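The (dimension, value) context model above can be sketched in a few lines of Python. The dimension names and the sample policy below are illustrative assumptions for this sketch, not part of the scheme in [14].

```python
# Minimal sketch of context-based access control (illustrative only).
# A context is a set of (dimension, value) pairs; a policy grants access
# when every dimension it constrains matches the request's context.

def satisfies(context: dict, policy: dict) -> bool:
    """Return True if the context matches every (dimension, value)
    constraint in the policy. Dimensions absent from the policy are
    unconstrained."""
    return all(context.get(dim) == val for dim, val in policy.items())

# Hypothetical policy: only a nurse in ward 3 may read vitals.
read_vitals_policy = {"who": "nurse", "where": "ward-3"}

request_ok = {"who": "nurse", "where": "ward-3", "when": "09:12"}
request_bad = {"who": "visitor", "where": "ward-3"}

print(satisfies(request_ok, read_vitals_policy))   # True
print(satisfies(request_bad, read_vitals_policy))  # False
```

A real deployment would also type-check each dimension's value against its value domain (e.g. *where* must be a known location name), which this sketch omits.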

Figure 4: Security Measures [4]

Encryption to ensure Data Integrity: Attribute-based encryption (ABE) provides a mechanism by which we can ensure that even if the storage is compromised, the loss of information will only be minimal. What attribute-based encryption does is effectively bind the access-control policy to the data and the users (clients), instead of having a server mediate access to files. [6]

Securing Network Architecture: The IETF has proposed a paradigm known as Representational State Transfer (ReST) which is used for extending web services over IoT. There is a similarity between conventional web services and IoT services following the ReST paradigm, which helps developers and users apply traditional web knowledge to IoT web-based services. [8]

Event Logging & Activity Monitoring: This process is performed by examining electronic audit logs for indications that unauthorized security-related activities have been attempted or performed on a system or application that processes, transmits or stores confidential information. When properly designed and implemented, system event logging and monitoring assist organizations in determining what has been recorded on their systems for follow-up investigation and, if necessary, remediation. [9]

Mathematical Model
System Description: Let S be the whole system, S = {I, P, O}, where I = input, P = procedure, O = output.
Users: u = {owner, doctor, health care staff} = {u1, u2, ..., un}
Keywords: k = {k1, k2, ..., kn}
H = heart sensor, T = temperature sensor, D = details
EHR = Electronic Health Record
Trapdoor generation: t = {t1, t2, ..., tn}
Input: I = {I0, I1, I2, I3}, where I0 = {H, T, D}, I1 = u, I2 = k, I3 = EHR
Procedure: P = {P0, P1, P2, P3, P4, P5}, where P0 = EHR encryption (AES algorithm used), P1 = k, P2 = t, P3 = key generation, P4 = selling of secret keys, P5 = KGC (Key Generation Center)
Output: O = {O0, O1, O2}, where O0 = EHR decryption, O1 = user revocation, O2 = traitor identification
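As a rough illustration of the P0/P3/O0 steps above (derive a per-user key, encrypt the EHR for storage, decrypt it for an authorized user), the sketch below shows a key-generation and encrypt/decrypt round trip. A SHA-256 counter-mode keystream stands in for AES so the example needs only the standard library; the actual scheme uses AES with keys issued by the KGC, and all names here are illustrative.

```python
import hashlib
import hmac
import os

def generate_user_key(master_secret: bytes, user_id: str) -> bytes:
    """KGC-style key derivation: each user's key is derived from a master
    secret and the user's identity (illustrative, not the paper's scheme)."""
    return hmac.new(master_secret, user_id.encode(), hashlib.sha256).digest()

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """SHA-256 counter-mode keystream: a dependency-free stand-in for AES."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)  # fresh nonce per message
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    nonce, body = ciphertext[:16], ciphertext[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))

master = os.urandom(32)
key = generate_user_key(master, "doctor-u2")
record = b"EHR: heart=72bpm temp=98.6F"
assert decrypt(key, encrypt(key, record)) == record
```

A user without a key issued from the same master secret recovers only garbage, which mirrors the access-control intent of P3/P4; a production system would use authenticated AES (e.g. AES-GCM) rather than this stand-in.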

Figure 5: Mapping Diagram

6. CONCLUSION
The proposed measures safely store and transmit PHRs to the authorized elements in the cloud. The strategy preserves the security of the PHRs and authorizes patient-driven access control to the various segments of the PHRs based on the access provided by the patients. We implemented a context access control technique so that even valid system clients cannot get to those segments of the PHR for which they are not authorized. The PHR owners store the encrypted information on the cloud, and only the approved users holding valid re-encryption keys issued by a semi-trusted authority can decrypt the PHRs. The job of the semi-trusted authority is to produce and store the public/private key pairs for the clients in the system. The performance evaluation is based on the time required for key generation, encryption and decryption tasks, and turnaround time.

REFERENCES
[1] Bhosale A.H., Manjrekar A.A. (2019). Attribute Based Storage Mechanism with De-duplication Filter: A Review Paper. In: Fong S., Akashe S., Mahalle P. (eds) Information and Communication Technology for Competitive Strategies. Lecture Notes in Networks and Systems, vol 40. Springer, Singapore.
[2] Jin-cui Yang, Bin-xing Fang. Security model and key technologies for the Internet of Things. The Journal of China Universities of Posts and Telecommunications, Volume 18, Supplement 2, 2011, Pages 109-112, ISSN 1005-8885. https://doi.org/10.1016/S1005-8885(10)60159-8
[3] Lo-Yao Yeh, Woei-Jiunn Tsaur, and Hsin-Han Huang. 2017. Secure IoT-Based, Incentive-Aware Emergency Personnel Dispatching Scheme with Weighted Fine-Grained Access Control. ACM Trans. Intell. Syst. Technol. 9, 1, Article 10 (September 2017), 23 pages. https://doi.org/10.1145/3063716
[4] Fei Hu. Security and Privacy in Internet of Things (IoT): Models, Algorithms and Implementations. CRC Press, 2016.
[5] Arjona, R.; Prada-Delgado, M.Á.; Arcenegui, J.; Baturone, I. A PUF- and Biometric-Based Lightweight Hardware Solution to Increase Security at Sensor Nodes. Sensors 2018, 18, 2429.
[6] S. Venugopalan. "Attribute Based Cryptology." PhD Dissertation, Indian Institute of Technology Madras, April 2011.
[7] Sumitra B, Pethuru CR & Misbahuddin M. "A survey of cloud authentication attacks and solution approaches." International Journal of Innovative Research in Computer and Communication Engineering, Vol. 2, No. 10, 2014, pp. 6245-6253.
[8] Sankar Mukherjee, G.P. Biswas. Networking for IoT and applications using existing communication technology. Egyptian Informatics Journal, Volume 19, Issue 2, 2018, Pages 107-127, ISSN 1110-8665. https://doi.org/10.1016/j.eij.2017.11.002
[9] https://www.controlcase.com/services/logmonitoring/
[10] Babar, Sachin & Stango, Antonietta & Prasad, Neeli & Sen, Jaydip & Prasad, Ramjee. (2011). Proposed Embedded Security Framework for Internet of Things (IoT). 10.1109/WIRELESSVITAE.2011.5940923.
[11] Weber, Rolf. (2010). Internet of Things – New security and privacy challenges. Computer Law & Security Review, 26, 23-30. 10.1016/j.clsr.2009.11.008.
[12] K. Zhao and L. Ge. "A Survey on the Internet of Things Security." 2013 Ninth International Conference on Computational Intelligence and Security (CIS), Emeishan, China, 2013, pp. 663-667. doi:10.1109/CIS.2013.145.
[13] Security Issues and Challenges for the IoT-based Smart Grid. Procedia Computer Science, ISSN: 1877-0509, Vol. 34, Pages 532-537.
[14] V. Alagar, A. Alsaig, O. Ormandjiva and K. Wan. "Context-Based Security and Privacy for Healthcare IoT." 2018 IEEE International Conference on Smart Internet of Things (SmartIoT), Xi'an, 2018, pp. 122-128. doi:10.1109/SmartIoT.2018.00-14.
[15] Arbia Riahi Sfar, Enrico Natalizio, Yacine Challal, Zied Chtourou. A roadmap for security challenges in the Internet of Things. Digital Communications and Networks, Volume 4, Issue 2, 2018, Pages 118-137, ISSN 2352-8648. https://doi.org/10.1016/j.dcan.2017.04.003
[16] A. Abbas and S. U. Khan. "A Review on the State-of-the-Art Privacy-Preserving Approaches in the e-Health Clouds." IEEE Journal of Biomedical and Health Informatics, vol. 18, no. 4, pp. 1431-1441, July 2014. doi:10.1109/JBHI.2014.2300846.
[17] M. H. Au, T. H. Yuen, J. K. Liu, W. Susilo, X. Huang, Y. Xiang, and Z. L. Jiang. "A general framework for secure sharing of personal health records in cloud system." Journal of Computer and System Sciences, 2017.
[18] David Daglish and Norm Archer. "Electronic Personal Health Record Systems: A Brief Review of Privacy, Security, and Architectural Issues." IEEE, 2009.
[19] S. Yu, C. Wang, K. Ren and W. Lou. "Achieving Secure, Scalable, and Fine-grained Data Access Control in Cloud Computing." 2010 Proceedings IEEE INFOCOM, San Diego, CA, 2010, pp. 1-9. doi:10.1109/INFCOM.2010.5462174.


SMART WEARABLE GADGET FOR INDUSTRIAL SAFETY

Ketki Apte, Rani Khandagle, Rijwana Shaikh, Rani Ohal
Department of Computer Engineering, SCSMCOE, Nepti, Ahmadnagar, India.
[email protected], [email protected], [email protected], [email protected]

ABSTRACT
This paper describes a smart device, built with IoT hardware and protocols, which assists factory workers and other employees. It is a wearable glove device used in workspaces where power tools are constantly in use. The apparatus is built around a microprocessor acting as a central server, while the other sensors are interfaced with microcontrollers; it acts as a link for data transfer and performs different tasks. One microcontroller works as the master and controls the other microcontrollers attached to the different sensors. The master has an LCD screen and a few buttons, and it controls the other sensors and reads their data in real time. There are safety features in this glove, so workers cannot use any dangerous power tools without wearing proper equipment. The glove works as a security measure in such a way that each tool has restricted access according to the worker's level of expertise, and the glove is able to restrict access to the tools which are being used actively during a particular time frame. The central server and the various sensors, such as the heat, temperature and vibration sensors, log the entire data, which can be attached to and monitored by the master glove. Whenever a user gets hurt and shouts in pain, an analysis function classifies the pain and calls for medical help, because the system has the extra capability of analysing the tone of workers. A sweep-based camera module is used along with the central server to record and live-stream the captured video when a power tool is switched on. This system focuses on the importance of worker safety on the factory floor.

Keywords
Internet of Things, Industry 4.0, MQTT, Node, Wireless Communications, Factory.

1. INTRODUCTION
With the start of the industrial revolution, power tools became a very important part of the factory floor. Every day, millions of people go to work and operate potentially life-threatening machines. According to publicly available statistics, more than a hundred thousand people are injured in power-tool-related accidents every year. This results in a huge loss of precious work-force and other resources. The idea of Connected Machines is an appealing one, and it can be applied to large- as well as small-scale machinery to improve the efficiency and thus the productivity of factories. It is believed that both the aforementioned ideas can go hand in hand and that we can create a solution that would help with safety in factories and improve the efficiency provided by the Internet of Things. In such a system, different sensors like temperature, ambient-light and accelerometer sensors can be used to capture the data.

2. RELATED WORK
Multiple hardware solutions exist to protect and increase the level of safety around any power tool or machinery. A set of safety and hazard rules is placed in the workspace to limit such issues. But current technology only aims at securing the machines and devices, and does not factor in human errors, which are one of the major issues in this case. The tools are not access-locked, and any user, irrespective of skill set, can use them. If proper protective measures are not taken seriously, this can lead to serious issues [1].


3. PROPOSED WORK
The proposed solution is an IoT-based system that implements a wearable which connects to any type of machinery and permits access based on whether proper safety equipment has been worn. We use sensors on this equipment and send their data to an Arduino. On the Arduino, we check whether the machine for which access is being requested is free to be used and whether all the proper gear is being used by the person requesting access; based on this information, the Arduino controls a relay that powers the machine [1].

4. SYSTEM ARCHITECTURE
The data glove is divided into two parts: 1. Transmitter 2. Receiver

Transmitter
This is the most important section of the smart glove. It delivers the data content from the main server system to the device and from the device back to the system; transmission throughout the system is performed by this device. The smart glove uses various devices to perform this operation. The temperature sensor measures -55°C to 150°C. For light detection, an LDR sensor is used. Using a 3-axis accelerometer, gesture values are represented in the form of X, Y and Z coordinates. The Arduino Nano converts analog values into digital values using its ADC; these values are processed and sent to the receiver side via the nRF24L01 transceiver, which performs the transmit and receive operations in combination.

Receiver
The Arduino UNO is the important unit for all operations. Using the nRF24L01, a value is fetched from the glove and transferred to the Arduino IDE.

Hardware
1. Arduino Nano: Based on the ATmega328/168. Power is supplied to the Arduino through a Mini-B USB connection or a 5V regulated external power supply. The Arduino Nano has 32 KB of memory and 14 digital pins.
2. Arduino UNO: It has 14 digital I/O pins, a USB connection, a power port and an ICSP header. It supports plug and play via the USB port. Sensor values are transferred using the Arduino Nano to the Arduino IDE.
3. Temperature Sensor: Basically used for measuring temperature fluctuations around the sensor. The LM35 is the preferred sensor; it measures temperatures ranging from -55 to +150 degrees Celsius.
4. LDR Sensor (Light Dependent Resistor): This is a light-dependent sensor. When light falls on the LDR, its resistance decreases and thus its conductivity increases.
5. 3-Axis Accelerometer (ADXL335): A low-power sensor that measures accelerations in the range ±3g. It detects the vibrations of machinery and measures both static and dynamic acceleration.
6. nRF24L01: A low-power transceiver that operates on a frequency of 2.4 GHz, mainly used for wireless communication. It is the preferred transceiver.

Software
Arduino IDE: Used for embedded application development; it executes on Windows, Linux, Mac, etc. and supports embedded C and C++.
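The raw 10-bit ADC readings described above must be converted to physical units before being packed into an nRF24L01 payload. The Python sketch below mirrors that arithmetic (the real firmware would be embedded C); the 5 V reference and the LM35's 10 mV/°C scale are standard for this hardware, while the payload layout and the ADXL335 calibration constants are illustrative assumptions.

```python
import struct

ADC_MAX = 1023   # 10-bit ADC on the ATmega328
V_REF = 5.0      # Arduino analog reference voltage (assumed 5 V)

def lm35_celsius(raw: int) -> float:
    """LM35 outputs 10 mV per degree Celsius."""
    volts = raw * V_REF / ADC_MAX
    return volts / 0.010

def adxl335_g(raw: int, zero_g_v: float = 1.65, sens_v_per_g: float = 0.330) -> float:
    """ADXL335: ratiometric output, ~330 mV/g around a ~1.65 V zero-g bias
    (typical datasheet values; calibrate per device)."""
    volts = raw * V_REF / ADC_MAX
    return (volts - zero_g_v) / sens_v_per_g

def pack_payload(temp_c: float, ax: float, ay: float, az: float) -> bytes:
    """Hypothetical 16-byte payload (the nRF24L01 maximum is 32 bytes)."""
    return struct.pack("<4f", temp_c, ax, ay, az)

temp = lm35_celsius(raw=61)          # ~0.298 V  ->  ~29.8 degC
payload = pack_payload(temp, 0.0, 0.0, 1.0)
print(round(temp, 1), len(payload))  # 29.8 16
```

The receiver side would unpack the same layout with `struct.unpack("<4f", payload)`, which keeps both ends of the radio link agreed on the wire format.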


Fig 1: Architecture diagram

5. CONCLUSION
This system is IoT-based. The wearable glove is equipped with different sensors (temperature, LDR and a 3-axis accelerometer) together with the Arduino IDE, an Arduino Nano and an Arduino Uno, and is powered by a 5V battery. In a small-scale industry, the smart glove connects to any type of machinery and provides access to it based on whether proper safety for that machine is ensured.

6. ACKNOWLEDGEMENT
We take this opportunity to express our hearty thanks to all those who helped in the completion of the project on Smart Wearable Gadget for Industrial Safety. We would especially like to express our sincere gratitude to our guide, Prof. Pauras Bangar, and to Prof. J. U. Lagad, HOD, Department of Computer Engineering, who extended their moral support, inspiring guidance and encouragement of independence throughout this task. We would also like to thank our Principal, Dr. R. S. Deshpande, for his great insight and motivation. Last but not least, we would like to thank our fellow colleagues for their valuable suggestions.

REFERENCES
[1] Chirag Mahaveer Parmar, Projjal Gupta, K. Shashank Bhardwaj (Members, IEEE). "Smart Work-Assisting Gear." Next Tech Lab (IoT Division), SRM University, Kattankulathur, 2018.
[2] Aditya C, Siddharth T, Karan K, Priya G. "Meri Awaz - Smart Glow Learning Assistant for Mute Students and Teachers." IJIRCCE, 2017.
[3] Umesh V. Nikam, Harshal D. Misalkar, Anup W. Burange. "Securing MQTT Protocol in IoT by Payload Encryption Technique and Digital Signature." IAESTD, 2018.
[4] Dhawal L. Patel, Harshal S. Tapse, Praful A. Landge, Parmeshwar P. More and Prof. A. P. Bagade. "Smart Hand Gloves for Disable Peoples." IRJET, 2018.
[5] Suman Thakur, Mr. Manish Varma, Mr. Lumesh Sahu. "Security System Using Aurdino Microcontroller." IIJET, 2018.
[6] Radhika Munoli, Prof. Sankar Dasiga. "Secured Data Transmission For IoT Application." IJARCCE, 2016.
[7] Ashton K. "That 'Internet of Things' Thing." RFiD Journal, 2009.
[8] Vincent C. Conzola, Michael S. Wogalter. "Consumer Product Warnings: Effects of Injury Statistics on Recall and Subjective Evaluation." HFES, 1998.


SMART SOLAR REMOTE MONITORING AND FORECASTING SYSTEM

Niranjan Kale, Akshay Bondarde, Nitin Kale, Shailesh Kore, Prof. D. H. Kulkarni
Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon (Bk), Pune, India.
[email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
Among innovative developing technologies, IoT makes work quicker and smarter to implement. Each solar photovoltaic cell of a solar panel needs to be monitored to know its present status; this involves monitoring and sensing deficiencies in the solar cells of a panel and applying curative measures so that it works in a good condition. The Internet of Things is a vision in which the internet spreads into the real world, embracing everyday objects. The IoT permits objects to be detected and/or controlled remotely over the prevailing network structure, creating opportunities for seamless integration of the physical world into computer-based systems, and resulting in improved efficiency, accuracy and economic advantage, in addition to reduced human interference. This technology has several applications, such as solar cities, smart villages, micro grids and solar path lights, and renewable energy has grown at a rate more rapid than at any other time in history through this period. The suggested structure refers to the online display of the power usage of solar energy as a renewable energy. This monitoring is completed through a Raspberry Pi with the Flask framework. Smart monitoring displays the everyday usage of renewable energy; this helps the user to scrutinize energy usage and to analyze renewable energy usage and electricity issues. The suggested work implements estimation approaches for both the solar resource and PV power. The system uses a reinforcement learning methodology for future prediction of power generation. We also forecast fault detection as well as the dead state of a panel. In the experimental investigation we match the actual expectation and the energy generation of the panel against time parameters.

General Terms
Internet of Things, Solar Cell, Raspberry Pi, Renewable Energy, Machine Learning.

Keywords
Solar Power, Battery, Sensors, Remote Monitoring, Raspberry Pi, Mycielski-Markov Model.

1. INTRODUCTION
Renewable energy sources, such as solar and wind, offer many environmental advantages over fossil fuels for electricity generation, but the energy produced by them fluctuates with changing weather conditions. In this work we propose a solar energy generation and analysis system with prediction in an IoT environment. We also propose an energy prediction scenario based on data mining and prediction techniques. The system can provide the capacity of energy productivity of a PV panel.

2. MOTIVATION
Today's solar plants are highly unstructured and localized. There is a need to map the prediction-scenario accuracy of the system, and to study and analyze how environmental factors can affect technical predictions. Photovoltaic plants generate energy, but we cannot monitor the performance of each solar panel.

3. LITERATURE SURVEY
Fatih Onur Hocaoglu and Fatih Serttas [1] suggested a system, A novel hybrid (Mycielski-Markov) model for hourly


solar radiation forecasting. System focuses on short term predictions of solar radioactivity are revised. An alternate method and model is suggested. The method accepts that solar radiation data recurrences itself in the history. Allowing to this preliminary supposition, a novel Mycielski constructed model is planned. This model reflects the recorded hourly solar radiation statistics as an array and starting from the last record value, it goes to discovery most parallel sub-array pattern in the history. This sub-array pattern agrees to the longest matching solar radiation data array in the history. The data detected after this lengthiest array in history is measured as the forecast. In case numerous sub-arrays are obtained, the model selects the choice rendering to the probabilistic relatives of the sub-patterns last values to the following value. To model the probabilistic relations of the data, a Markov chain model is approved and used. By this way historical search model is fortified. According to Yu Jiang [2] projected Dayahead Forecast of Bi-hourly Solar Radiance with a Markov Switch Approach, system uses a regime switching procedure to designate the progress of the solar radiance time-series. The optimal number of regimes and regime-exact parameters are unwavering by the Bayesian implication. The Markov regime switching model offers together the point and intermission forecast of solar viva city centered on the posterior distribution consequent from historical data by the Bayesian implication. Four solar viva city predicting models, the perseverance model, the autoregressive (AR) model, the Gaussian process regression (GPR) model, as well as the neural network 1. model, are measured as starting point models for authenticating the Markov switching model. The reasonable analysis based on numerical experiment outcomes determines that in overall the Markov regime exchanging model accomplishes ISSN:0975-887

better than the compared models in the day-ahead point and interval prediction of solar radiance. Ali Chikh and Ambrish Chandra [3] proposed An Optimal Maximum Power Point Tracking Algorithm for PV Systems With Climatic Parameters Estimation; the system suggests a Maximum Power Point Tracking (MPPT) method for photovoltaic (PV) systems with a reduced hardware setup. It is realized by computing the instantaneous conductance and the incremental conductance of the array. The first is obtained from the array voltage and current, whereas the second, which is a function of the array junction current, is estimated by means of an adaptive neuro-fuzzy (ANFIS) solar cell model. Given the difficulty of measuring solar radiation and cell temperature, since these need extra sensors that would increase the hardware circuitry and measurement noise, an analytical model is proposed to estimate them with a de-noising wavelet-based algorithm. This approach helps to reduce the hardware setup to a single voltage sensor, while improving the array power efficiency and MPPT response time. 4. PROPOSED WORK 4.1 PROJECT SCOPE The product is an Android application used to manage daily mess attendance along with streamlining rebate and menu selection processes. The objective of the system is to provide a user-friendly daily attendance system that is easy to manage, maintain and query. Our primary focus is to develop a paperless system that provides the management a way to facilitate smoother functioning of the mess system.
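The comparison of instantaneous and incremental conductance that underlies MPPT methods like the one surveyed above can be sketched as follows. This is a minimal illustration only: the function name, step size and the whole routine are assumptions for exposition, and the paper's ANFIS-based estimator is not reproduced.

```python
def mppt_step(v, i, v_prev, i_prev, step=0.5):
    """One iteration of an incremental-conductance MPPT sketch: compare the
    incremental conductance dI/dV with the instantaneous conductance -I/V
    and nudge the operating voltage toward the maximum power point.
    (Illustrative only; names and step size are assumptions.)"""
    dv, di = v - v_prev, i - i_prev
    if dv == 0:
        # Voltage unchanged: react to the current change alone.
        return v if di == 0 else (v + step if di > 0 else v - step)
    g_inc, g_inst = di / dv, -i / v
    if g_inc == g_inst:
        return v                      # already at the maximum power point
    # Left of the MPP when dI/dV > -I/V: raise the voltage, else lower it.
    return v + step if g_inc > g_inst else v - step
```

At the maximum power point dP/dV = 0, which expands to dI/dV = -I/V; the sign of the difference tells the controller which side of the peak it is on.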

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 46


4.2 Method and Results
In total, three surveys and one experiment were conducted. The first survey was a questionnaire survey to explore what usability problems users experienced in the Netherlands and South Korea; this study resulted in the categorization of soft usability problems. The second survey investigated how user characteristics are related to the occurrence of specific soft usability problems. Finally, an experiment was conducted to find out how user characteristics are correlated to specific soft usability problems depending on the type of product in the USA, South Korea and the Netherlands. Based on the findings from the studies, an interaction model (PIP model: Product-Interaction-Persona model) was developed which provides insight into the interaction between user characteristics, product properties and soft usability problems. Based on this PIP model, a workshop and an interactive tool were developed. Companies can use the PIP model to gain insights into probable usability problems of a product they are developing and the characteristics of those who would have problems using the product.
4.3 Design & Implementation Constraints
This protocol is implemented in the Java language; we also use the HTTP/TCP/IP protocols. Java has had a profound effect on the Internet because it expands the universe of objects that can move about freely on the Internet. There are two types of objects we transmit over the network: passive and dynamic. Network programs also present serious problems in the areas of security and portability: when we download a normal program we risk viral infection. Java provides a firewall to overcome these problems, and addresses them through applets. By using a Java-compatible


Web browser, we can download Java applets without fear of viral infection.
4.4 Functional Requirements
 System must be fast and efficient
 User-friendly GUI
 Reusability
 Performance
5. FIGURES/CAPTIONS:

Fig: System Architecture

6. ALGORITHM
1. Collect data from sensors (time series technique).
2. Measure the energy level.
3. Store the data in the data set.
4. For solar energy forecasting (Mycielski-Markov model): if the data matches historical data, it gives an accurate result; else select the probable prediction, which gives a highly possible result.
5. Send an hourly notification of the status of the solar panel.
This is an ideal solution to increase the efficiency of solar plant monitoring; failure detection helps to harvest more energy and improves accuracy in the prediction of solar radiation.
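One cycle of the loop above could look like the following sketch. This is hypothetical Python: `low_threshold`, the nearest-value fallback for unseen levels, and all names are illustrative assumptions, not the authors' implementation.

```python
import statistics

def hourly_status(readings, history, low_threshold=0.5):
    """One pass of the monitoring loop sketched in the algorithm above
    (illustrative names; the threshold is an assumption, not from the paper)."""
    level = statistics.mean(readings)      # measure the energy level
    history.append(level)                  # store it in the data set
    past = history[:-1]
    if level in past:                      # data matches historical data
        forecast = level
    else:                                  # otherwise pick the closest past value
        forecast = min(past, key=lambda h: abs(h - level)) if past else level
    alert = level < low_threshold          # flag a possible panel failure
    return forecast, alert
```

The returned pair can feed the hourly notification step: the forecast goes into the status message, and the alert flag marks a suspected failure.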


The time series technique works on a collection of data points taken at constant time intervals; these are analyzed to determine the long-term trend so as to forecast the future. The Mycielski-Markov model needs only historical solar data, without any other parameters; repetition in the history directly gives accurate results.
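The longest-matching-suffix search behind the Mycielski-Markov model described above can be sketched roughly as follows. Illustrative Python only: the function name and the frequency-based tie-break (standing in for the Markov-chain step) are assumptions.

```python
def mycielski_forecast(history):
    """Forecast the next value by finding the longest sub-array in the past
    that matches the series' current suffix (a sketch of the Mycielski-style
    search described above; names are illustrative)."""
    n = len(history)
    for length in range(n - 1, 0, -1):          # try the longest suffix first
        suffix = history[n - length:]
        followers = [history[s + length]        # value that followed each match
                     for s in range(n - length)
                     if history[s:s + length] == suffix]
        if followers:
            # Several matches: pick the most frequent follower, a simple
            # stand-in for the paper's Markov-chain tie-breaking.
            return max(set(followers), key=followers.count)
    return history[-1]                          # no repetition: persistence
```

On the sample series [1, 2, 3, 1, 2, 3, 1, 2] the longest repeated suffix is [1, 2, 3, 1, 2], which was followed by 3 in the history, so 3 becomes the forecast.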

7. ACKNOWLEDGMENTS
With due respect and gratitude, we take the opportunity to thank all those who have helped us directly and indirectly. We convey our sincere thanks to Prof. P. N. Mahalle, HoD, Computer Dept., and Prof. D. H. Kulkarni for their help in selecting the project topic and for their support. Our guide Prof. D. H. Kulkarni has always encouraged us and given us the motivation to move ahead. He has put in a lot of time and effort in this seminar along with us and given us a lot of confidence. We wish to extend a big thank you to him for the same. Also, we wish to thank all the other people who in even the smallest way have helped us in the successful completion of this project.
8. CONCLUSION
The solar PV PCU monitoring using Internet of Things has been experimentally shown to work satisfactorily by monitoring the parameters effectively through the internet. The proposed system not only monitors the parameters of the solar PV PCU, but also processes the data and creates reports according to the requirement, for example estimating the unit plot and computing the total units produced per month. It also stores all the parameters in the cloud in a timely manner. This will help the user to check the condition of many parameters in the solar PV PCU. Applying renewable energy technologies is one suggested way of reducing the environmental impact. Because of frequent power cuts it is important to use renewable energy and to monitor it. Monitoring guides the user in the scrutiny of renewable energy usage. This system is cost effective, with an efficiency of about 95%. This allows the proficient use of renewable energy and thus reduces electricity problems.
REFERENCES

[1] Day-ahead Prediction of Bi-hourly Solar Radiance with a Markov Switch Approach, Yu Jiang, Huan Long, Zijun Zhang, and Zhe Song, IEEE Transactions on Sustainable Energy, 2017, DOI 10.1109
[2] An Optimal Maximum Power Point Tracking Algorithm for PV Systems With Climatic Parameters Estimation, Ali Chikh and Ambrish Chandra, IEEE Transactions on Sustainable Energy, 2015, DOI 10.1109
[3] Critical weather situations for renewable energies - Part B: Low stratus risk for solar power, Carmen Köhler, Andrea Steiner, Yves-Marie Saint-Drenan, Dominique Ernst, Anja Bergmann-Dick, Mathias Zirkelbach, Zied Ben Bouallegue, Isabel Metzinger, Bodo Ritter, Elsevier, Renewable Energy (2017), http://dx.doi.org/10.1016/j.renene.2016.09.002
[4] Sentinella: Smart Monitoring of Photovoltaic Systems at Panel Level, Bruno Andò, Salvatore Baglio, Antonio Pistorio, Giuseppe Marco Tina, and Cristina Ventura, IEEE, 2015, DOI 10.110
[5] Monitoring system for photovoltaic plants: A review, Siva Ramakrishna Madeti, S. N. Singh, Alternate Hydro Energy Centre, Indian Institute of Technology Roorkee, Renewable and Sustainable Energy Reviews 67 (2017) 1180-1207, http://dx.doi.org/10.1016/j.rser.2016.09.088
[6] Design and implementation of a solar plant and irrigation system with remote monitoring and remote control infrastructures, Yasin Kabalci, Ersan Kabalci, Ridvan Canbaz, Ayberk Calpbinici, Elsevier, Solar Energy 139 (2016)
[7] Forecasting of solar energy with application for a growing economy like India: Survey and


implication, Sthitapragyan Mohanty, Prashanta K. Patra, Sudhansu S. Sahoo, Asit Mohanty, Elsevier, Renewable and Sustainable Energy Reviews 78 (2017), http://dx.doi.org/10.1016
[8] Utility scale photovoltaic plant indices and models for on-line monitoring and fault detection purposes, Cristina Ventura, Giuseppe Marco Tina, Elsevier, Electric Power Systems Research 136 (2016), http://dx.doi.org/10.1016/j.epsr.2016.02.006
[9] Improving the performance of power system protection using wide area monitoring systems, Arun G. Phadke, Peter Wall, Lei Ding, Vladimir Terzija, Springer, J. Mod. Power Syst. Clean Energy (2016), DOI 10.1007/s40565-016-0211-x.


SMART AGRICULTURE USING INTERNET OF THINGS
Akshay Kudale, Yogesh Bhavsar, Ashutosh Auti, Mahesh Raykar, Prof. V. R. Ghule

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering Savitribai Phule Pune University, Vadgaon(Bk), Pune, India. [email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
Today agriculture is embedded with advanced services like GPS and sensors that enable devices to communicate with each other, analyze the information and further exchange information among themselves. IT provides services such as the cloud to farming. Internet of Things plays an important role in smart farming. Smart farming is an emerging concept, because IoT sensors are capable of providing information about agricultural field conditions. The combination of traditional methods with the latest advancements in technologies such as Internet of Things and WSNs can lead to agriculture modernization. The Wireless Sensor Network collects the data from different types of sensors and sends it to the main server using a wireless protocol. There are many other factors that affect the productivity to a great extent. These factors include attack of insects and pests, which can be controlled by spraying the proper insecticides and pesticides, and also attack of wild animals and birds when the crop grows up. The crop yield is declining because of unpredictable monsoon rainfalls, water scarcity and improper water usage. The developed system is more efficient and beneficial for farmers. It gives information about the temperature and humidity of the air in the agricultural field and other soil nutrients through a mobile application to the farmer if a value falls out of the optimal range. The application of such a system in the field can definitely help to advance the harvest of the crops and global production.
General Terms
Internet of Things (IoT), Machine Learning, Passive Infrared Sensor (PIR)
1. INTRODUCTION
Agriculture is considered as the basis of life for the human species as it is the main source of food grains and other raw materials. It plays a vital role in the growth of the country's economy. It also provides large and ample employment opportunities to the people. Growth in the agricultural sector is necessary for the development of the economic condition of the country. Unfortunately, many farmers still use traditional methods of farming, which results in low yield of crops and fruits. But wherever automation has been implemented and human beings have been replaced by automatic machinery, the yield has improved. Hence there is a need to implement modern science and technology in the agriculture sector for increasing the yield. The proposed system collects the data from various sensors and provides information about different environmental factors, which in turn helps to monitor the system. Monitoring environmental factors alone is not enough and is not a complete solution to improve the yield of the crops. There are a number of other factors that affect the productivity to a great extent. These factors include attack of insects and pests, which can be controlled by spraying the crop with proper insecticides and pesticides. Secondly, attack of wild animals and birds when the crop grows up. There is also the possibility of theft when the crop is at the stage of harvesting. Even after harvesting, farmers also face problems in the storage of the harvested crop. So, in order to provide solutions to


all such problems, it is necessary to develop an integrated system which will take care of all the factors affecting productivity at every stage, like cultivation, harvesting and post-harvest storage. The proposed system is therefore useful in monitoring the field data as well as controlling the field operations, which provides flexibility.
2. MOTIVATION
Agriculture is the basis for the human species as it is the main source of food, and it plays an important role in the growth of the country's economy. Agriculture is the prime occupation in our country. 65% of our country's population works in the agriculture sector, and the sector contributes 20% of the GDP of our country. Farmers use traditional methods for farming, which results in reduced quality of yields. Traditional methods reduce the quantity of crops, further reducing the net profit generated. Farmers have insufficient information about soil, appropriate water level and atmospheric conditions, which leads to crop degradation. With the help of Internet of Things, we can overcome these drawbacks and can help farmers in reducing their efforts and increasing crop production. Using a smart IoT system, a farmer can increase the yield and the net profit generated from the field.
3. STATE OF ART
Table 1. Literature survey
Year | Author | Title
2016 | Nikesh Gondchawar, Prof. Dr. R. S. Kawitkar | IoT based Smart Agriculture
2017 | Tanmay Baranwal, Nitika, Pushpendra Kumar Pateriya | Development of IoT based Smart Security and Monitoring Devices for Agriculture
2015 | Nelson Sales, Orlando Remedios, Artur Arsenio | Wireless Sensor and Actuator System for Smart Irrigation on the Cloud
2017 | Prathibha S, Anupama Hongal, Jyothi M | IoT Based Monitoring System In Smart Agriculture

4. GAP ANALYSIS
Parameter: Irrigation automation using IoT
Advantages: 1. Data collected by sensors will help in deciding ON and OFF of the irrigation system. 2. Remote controlling of the system reduces farmers' efforts.
Disadvantages: 1. All parameters of soil are not considered while automating irrigation. 2. The system is not reliable in some cases, as it fails to provide correct output.
Parameter: Intelligent Security and Warehouse Monitoring device
Advantages: 1. The system can be controlled and monitored from a remote location. 2. Threats of rodents and thefts can be easily detected.
Disadvantages: 1. The system doesn't identify and categorize between humans, mammals and rodents. 2. The system doesn't satisfy all test cases, and this increases the threat of not detecting rodents and thefts.

5. PROPOSED WORK
The project will help in transforming and reorienting agricultural systems to effectively support development and ensure food security in a changing climate. The project is based on the consideration that the proposed system will help in increasing the quality and quantity of yield. The system will gather information about climate change, soil nutrients, etc., using the sensors installed in the field, to predict the suitable crops for those climate conditions. The Smart Agriculture system will continuously monitor the field and will suggest suitable actions. The Smart Warehouse system will detect and differentiate between humans and rodents and will trigger alerts.
Assumptions and Dependencies
In the proposed system there are various assumptions which are important for its working. It is important that the data gathered by


sensors is correct. Data collected by sensors is assumed to be the same in all areas of the field. The arrangement of the whole system is unchanged and secured. The warehouse security system will differentiate between rodents and humans based on size. The proposed system depends on the consideration that users have a good internet connection and the local system has a power supply. Also, the user should have the mobile application installed, where alerts will be provided.
Requirements
The functional requirements of the system include the data gathered by the sensors and the decisions which are taken on the basis of this data. The data provided by sensors can contain some noise, so it requires refining. The processing model installed on the cloud platform will take this refined data as input and then take decisions based on the dataset values. Accordingly, alerts will be provided to farmers through mobile applications. The user of this system is a farmer, so we have to design the application accordingly. The system must provide reliable alerts to the user, which will help him in making decisions and taking actions about the field.
Steps Involved

Fig. 1. Steps Involved
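The noise-refining and out-of-range alerting described under Requirements above could be sketched as below. This is a hedged illustration: the moving-average window and the `should_alert` helper are assumptions, not the authors' processing model.

```python
def smooth(values, window=3):
    """Simple moving-average refinement of noisy sensor readings
    (one possible way to 'refine' the data, as the requirements state)."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - window + 1)            # average over the recent window
        out.append(sum(values[lo:i + 1]) / (i + 1 - lo))
    return out

def should_alert(value, optimal_range):
    """Alert the farmer when a refined reading falls outside the optimal range."""
    low, high = optimal_range
    return not (low <= value <= high)
```

A refined temperature series can then be checked reading by reading, and only out-of-range values trigger a mobile notification.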

As shown in Fig. 1, the model will proceed in three steps: collecting the data from the field using sensors, processing the collected data on


the cloud platform, and providing suggestions to farmers through a mobile application.
System Design

Fig.2. DFD level 0

Fig.3. Data Flow Diagram level 1

Fig. 2 and Fig. 3 show the data flow diagrams of the system: a graphical representation of the flow of data through an information system, and a preliminary step in creating an overview of the system. DFD level 0 shows three components (farmers, local system and administrator) which interact with the model. DFD level 1 describes the functions through which the farmers, local system and administrator interact with the system. The local system collects data using sensors. Farmers can request and view their data on the system. The administrator manages the stored data.
Other Specification
The proposed system provides advantages in terms of increasing the quality and quantity of yield and reducing the risk of damage caused by natural calamities. This system will also help in improving soil fertility and soil nutrients, increasing the net


profit of farmers and reducing their efforts. The system will promote smart farming techniques. It has some limitations in terms of the requirement of a constant power supply and a stable internet connection. Also, the farmer should be able to use a smartphone and should be able to afford the cost of the proposed system.
6. CONCLUSION AND FUTURE WORK
Internet of Things is widely used in connecting devices and collecting information. All the sensors are successfully interfaced with the Raspberry Pi and wireless communication is achieved between various nodes. All observations and experimental tests prove that the project is a complete solution to field activities, environmental problems and storage problems using a smart irrigation system and a smart warehouse management system. Implementation of such a system in the field can definitely help to improve the yield of the crops and overall production. The device can incorporate pattern recognition techniques for machine learning to identify objects and categorize them into humans, rodents and


mammals; sensor fusion can also be done to increase the functionality of the device. By improving these aspects, the device can be used in different areas. This project can undergo further research to improve the functionality of the device and its applicable areas. We have opted to implement this system as a security solution in the agricultural sector, i.e. farms, cold stores and grain stores.
REFERENCES
[1] Nikesh Gondchawar, Prof. Dr. R. S. Kawitkar, IoT based Smart Agriculture, International Journal of Advanced Research in Computer and Communication Engineering, Vol. 5, Issue 6, ISSN (Online) 2278-1021, ISSN (Print) 2319-5940, June 2016.
[2] Tanmay Baranwal, Nitika, Pushpendra Kumar Pateriya, Development of IoT based Smart Security and Monitoring Devices for Agriculture, 6th International Conference on Cloud System and Big Data Engineering, 978-1-4673-8203-8/16, 2016 IEEE.
[3] Nelson Sales, Artur Arsenio, Wireless Sensor and Actuator System for Smart Irrigation on the Cloud, 978-1-5090-0366-2/15, 2nd World Forum on Internet of Things (WF-IoT), Dec 2015, published in IEEE Xplore Jan 2016.
[4] Prathibha S R, Anupama Hongal, Jyothi M P, IoT based Monitoring System In Smart Agriculture, 2017 International Conference on Recent Advances in Electronics and Communication Technology.


AREA-WISE BIKE POOLING - "BIKEUP"
Mayur Chavhan, Amol Kharat, Sagar Tambe, Prof. S. P. Kosbatwar
Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, India
[email protected], [email protected], [email protected], [email protected]

ABSTRACT
This study summarizes the implementation of a bike pooling system and its services, such as providing low-cost travelling for middle-class families who cannot afford high travel expenses. The system is also very useful in rural areas, where transport vehicles are few in number.
Keywords: Bidding, Auction.
1. INTRODUCTION
Nowadays, cabs are in great demand for transportation. The concept of Ola and Uber is that we book a ride through their application by providing the pickup point as well as the destination point. People use their means of transportation according to need, e.g.:

A person riding a bike uses a bike; a person driving a car uses a car, etc.

But besides all this, the addition of cabs results in more traffic. So we are developing an application, named "BikeUp", which will help in traffic reduction by using two-wheelers to provide transportation services to the people. For example, if a single person wants to travel, he currently has to book a ride in a cab or an auto rather than on a two-wheeler. So here we are using private two-wheelers as public transport: the person riding the bike as well as the end user will need the application installed on their mobiles; then the end user and the rider will both enter their destination points, and the pickup point will be generated using Google Maps. A broadcast message from the biker will be sent, after which the controller of the application will match the destination points of both persons occurring within a particular

range of 100 m to 5 km; if the destination points match, a request from the end user side will be sent and accepted by the biker, who, after applying the essential charges to the end user, will drop the end user at the required location. Here we use the first law of business: "Use Public Investment for Business".
2. MOTIVATION
The motivation for doing this project was primarily an interest in undertaking a challenging project in an interesting area. The observation of increasing traffic gave rise to the thought of developing a project that would lead to a decrease in traffic as well as provide efficient means of transportation to people. This will help to lower the tremendously increasing pollution. It will also be very useful for common people who cannot pay for cabs. Observing both these factors, the need to control the daily increase in traffic as well as pollution, gave the idea for this project.
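The destination-matching step described in the introduction, which pairs a rider and an end user whose destinations fall within the stated 100 m to 5 km window, can be sketched as follows. This is illustrative Python checking only the 5 km upper bound; the function names and the Earth-radius constant are conventional assumptions, not from the paper.

```python
import math

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))   # 6371 km: mean Earth radius

def destinations_match(rider_dest, passenger_dest, max_km=5.0):
    """Match a rider and an end user when their destinations lie within
    the 5 km window mentioned above (names are illustrative)."""
    return haversine_km(rider_dest, passenger_dest) <= max_km
```

Two points about a kilometre apart in Pune would match, while a Pune-to-Mumbai pair (roughly 120 km apart) would not.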

3. STATE OF ART
Paper Title: Mobile applications development is composed of three groups: native, hybrid and web. In this paper a comparison between native and hybrid mobile applications built on JavaScript (React Native, NativeScript and Ionic) is done. The analysis is done using the 7


most relevant principles of mobile applications development. This paper shows that React Native exhibits the best results in all the analyzed principles while still having the benefits of hybrid development relative to native. With the emergence of frameworks for mobile development, some of them with little more than a year of existence, it is difficult to perceive which are the most advantageous for a given business objective; this article shows the best options among the frameworks used, always comparing with native development.
Paper Title: Among the various impacts caused by high-penetration distributed generation (DG) in medium and low voltage distribution networks, the issues of interaction between the DG and feeder equipment, such as step voltage regulators (SVRs), have been increasingly brought into the focus of computational analyses and real-life case studies. Particularly, the SVR's runaway condition is a major concern in recent years due to the overvoltage problem and the SVR maintenance costs it entails. This paper aims to assess the accuracy of the quasi-static time series (QSTS) method in detailing this phenomenon when compared to the classical load flow formulation. To this end, simulations were performed using the OpenDSS software for two different test feeders and helped to demonstrate the effectiveness of the QSTS approach in investigating the SVR's runaway condition.
Paper Title: Autonomous Bidding Agents in the Trading Agent Competition. Abstract: Designing agents that can bid in online simultaneous auctions is a complex task. The authors describe task-specific details and strategies of agents in a trading agent competition. More specifically, the article describes the task-specific details of, and the general motivations behind, the four top-scoring agents. First, we discuss general strategies

used by most of the participating agents. We then report on the strategies of the four top-placing agents. We conclude with suggestions for improving the design of future trading agent competitions.
Paper Title: The opportunistic large array (OLA) with transmission threshold (OLA-T) is a simple form of cooperative transmission that limits node participation in broadcasts. Performance of OLA-T has been studied for disc-shaped networks. This paper analyzes OLA-T for strip-shaped networks. The results also apply to arbitrarily shaped networks that have previously limited node participation to a strip. The analytical results include a condition for sustained propagation, which implies a bound on the transmission threshold. OLA transmission on a strip network with and without a transmission threshold is compared in terms of total energy consumption.
4. GAP ANALYSIS
Standard platforms:
 They are standard Android or iOS applications.
 All the APIs are purely platform dependent for Ola and Uber.
 There is no algorithm which supports cross-platform use; for each platform there are different algorithms.
 No current system is available for two-wheeler transportation.
 Some companies provide such services but they don't have a proper implementation of these systems.
 In rural areas the transportation service is negligible.
BikeUp:
 It uses a cross-platform algorithm which works on many platforms.
 The APIs are platform independent for various clients, such as web, Android and iOS applications.
 This system will increase employability in rural areas by

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 55

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

adding their bikes to the system.


5. PROPOSED WORK
Transport plays a vital role in economic and human development. In the initial phases of development of an economy, transport requirements tend to increase at a considerably higher rate than the growth of the economy. In India, during 1990 to 2005, rail freight traffic increased nearly two and a half times and traffic by road almost five times. Traffic congestion increases vehicle emissions and degrades ambient air quality, and recent studies have shown excess morbidity and mortality for drivers, commuters and individuals living near major roadways.

Figure 1: Survey of various Vehicles

Figure 1 shows the various vehicle types and the pollution they produce, in percentage terms. The pollution share of bikes is considerably larger than that of trucks, because trucks are far fewer in number compared to bikes; pooling bike rides can therefore really help to reduce pollution. Regarding passenger traffic, road traffic increased almost three times. It has recently been reported that road traffic accounts for 87% of passenger traffic and 65% of freight traffic. The increase in road traffic has direct implications for pollution: in Delhi, vehicular pollution has been increasing since 2000. An entity relationship diagram (ERD) shows the relationships of entity sets

stored in a database. An entity in this context is an object, a component of data. An entity set is a collection of similar entities. These entities can have attributes that define their properties.

Figure 4.4: ER Diagram

In figure 4.4, the Customer entity contains cust_id, name, gender, destination and mob_no; these details are stored in a table and provided to the match_destination action. The Biker entity contains vehical_no, mob_no, gender, bike_name, biker_id and destination; of these attributes, the destination address is needed for the matching_destination action between the customer and biker entities.
6. CONCLUSION AND FUTURE WORK
In this paper we proposed a method and apparatus for managing the bidding process for services in India: a platform for connecting service providers to clients, to improve the local markets in India. This web site has an intuitive interface and unique visual objects that make it friendly to use. The online auction will provide a way to connect service providers and consumers. India needs a platform to connect the small businesses which lay the foundation of the Indian economy, and this platform will work toward the same. The platform will enable small businesses to connect to the people who need their services.


Future work may lay its emphasis on exploring the various methods and applications of blockchain in auctions by overcoming its limitations. More layers of hybrid functions can be included to further increase data integrity and security.
REFERENCES
[1] J.-N. Meier, A. Kailas, O. Abuchaar et al., "On augmenting adaptive cruise control systems with vehicular communication for smoother automated following", Proc. TRB Annual Meeting, Jan. 2018.
[2] Dan Ariely (2003), Buying, Bidding, Playing, or Competing? Value Assessment.
[3] Amy Greenwald (2001), Autonomous Bidding Agents in the Trading Agent Competition.


[4] Chia-Hui Yen (2008), Effects of e-service quality on loyalty intention: an empirical study in online auction.
[5] A. Kailas, L. Thanayankizil, M. A. Ingram, "A simple cooperative transmission protocol for energy-efficient broadcasting over multi-hop wireless networks", KICS/IEEE Journal of Communications and Networks (Special Issue on Wireless Cooperative Transmission and Its Applications), vol. 10, no. 2, pp. 213-220, June 2008.
[6] Y. J. Chang, M. A. Ingram, "Packet arrival time estimation techniques in software defined radio", in preparation.
[7] B. Sirkeci-Mergen, A. Scaglione, "On the power efficiency of cooperative broadcast in dense wireless networks", IEEE J. Sel. Areas Commun., vol. 25, no. 2, pp. 497-507, Feb. 2007.


SMART WATER QUALITY MANAGEMENT SYSTEM
Prof. Rachana Satao, Rutuja Padavkar, Rachana Gade, Snehal Aher, Vaibhavi Dangat
Dept. of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune
[email protected], [email protected]

ABSTRACT
Water pollution is one of the biggest threats to green globalization. Water pollution affects human health by causing waterborne diseases. In the present scenario, water parameters are detected by chemical laboratory tests, where the testing equipment is stationary and samples are brought to it. In this paper, the design of an Arduino-based water quality monitoring system that monitors the quality of water in real time is presented. The system consists of different sensors which measure water quality parameters such as pH, conductivity, muddiness of water and temperature.

Keywords
WSN: Wireless Sensor Network; pH: potential of Hydrogen; RM: Relay Module

1. INTRODUCTION
The quality of drinking water is essential for public health. Hence, it is necessary to prevent any intrusion into water distribution systems and to detect pollution as soon as possible, whether intentional or accidental. The protection of the visible assets (water storage tanks, pumping stations, treatment centers, etc.) can be realized by traditional intrusion detection. As a result, the network becomes more difficult to protect. In recent years, assistance and research programs have been developed to improve the safety and security of drinking water systems: to enhance the capability of system monitoring, sensors are placed which monitor various parameters of water quality in a timely manner [1]. Various algorithms take into account the variable characteristics of the water quality parameters [5][6]. There are also systems that can evaluate two to three parameters of water using pH sensors, turbidity sensors, temperature sensors, s::can and EventLab for contamination detection [2][14][4]. Our project provides a new water quality monitoring system for the water distribution network based on a wireless sensor network (WSN).

The proposed system is a water quality monitoring system on the Arduino platform that measures the pH, conductivity, temperature, and presence of suspended items on water bodies like lakes and rivers using sensors. These sensed parameters are sent to the authorized person via a server system in the form of messages, so that proper action can be taken by the authority in cleaning the water bodies to reduce the possible health problems that could occur.

2. MOTIVATION
Traditional water quality monitoring involves three steps, namely water sampling, testing and investigation. These are done manually by scientists. This technique is not fully reliable and gives no indication beforehand of the quality of the water. With the advent of wireless sensor technologies, some research has been carried out in monitoring water quality using wireless sensors deployed in water and sending short messages to farmers about the water. Research has also been carried out in analyzing the quality of water using machine learning algorithms.

3. LITERATURE SURVEY
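As a running illustration of the check-and-alert idea described above, the sketch below checks sensed parameters against safe ranges and raises an alert only when a parameter stays out of range for a full measurement window (the behaviour noted for pH, which may briefly exceed its thresholds and then return to standard values [7]). The threshold values, window size and message format are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch of windowed parameter checking; not the authors' code.
from collections import deque

SAFE_RANGES = {
    "ph": (6.5, 8.5),            # assumed acceptable pH band
    "turbidity_ntu": (0.0, 5.0),  # assumed turbidity limit
    "temperature_c": (5.0, 35.0),
}
WINDOW = 3  # assumed number of readings per measurement window

def make_monitor(parameter, window=WINDOW):
    low, high = SAFE_RANGES[parameter]
    recent = deque(maxlen=window)
    def feed(value):
        recent.append(not (low <= value <= high))
        # Alert only once the window is full and every reading is bad,
        # so a single transient spike does not trigger a message.
        if len(recent) == window and all(recent):
            return f"ALERT: {parameter} out of range"
        return "OK"
    return feed

feed = make_monitor("ph")
readings = [7.1, 9.2, 7.0, 9.3, 9.4, 9.5]  # one spike, then a sustained excursion
print([feed(ph) for ph in readings])
```

In a deployment, the "ALERT" string would correspond to the message sent to the authorized person via the server system.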


1. Design and Implementation of Cost Effective Water Quality Evaluation System (2017): In this research project, a system is developed and discussed which can evaluate three parameters of water. The detection of water parameters could reduce the rate of illness and unnecessary death, as well as create consciousness among people for a healthier life.
2. Smart sensor detection for water quality as anticipation of disaster environment pollution (2016): Water quality is good for water from the local government water company of Surabaya and Malang, mountain spring water, well water in Malang, and Aqua water. Water quality is less good for well water in Surabaya, while water quality is poor for tap water mixed with soap.
3. Smart Technology for Water Quality Control: Feedback about use of water quality sensors (2017): This project presented an analysis of the use of two smart sensors (S::CAN and EventLab) for early detection of water contamination. The performance of these sensors was first verified using a pilot station. Then, the sensors were installed in the distribution network of the Scientific Campus of the University of Lille. Recorded data showed quasi-constant signals. Some events were detected; they coincided with the start of water consumption. A comparison between recorded data and laboratory analyses confirmed the good performance of the tested sensors. The demonstration program continues in order to enhance experience with these innovative water quality sensors.
4. A Centrifugal Microfluidic-Based Approach for Multi-Toxin Detection for Real-Time Marine Water-Quality Monitoring (2017): To sustain rapidly increasing population growth, the global demand for clean, safe water supplies has never been more apparent. It has previously been reported, and predicted, that anthropogenic environmental impacts will continue to increase the prevalence and duration of harmful freshwater cyanobacterial and algae blooms. Human, ecological and economic health can all be negatively impacted by harmful cyanobacterial blooms formed due to eutrophication.
5. Towards a water quality monitoring system based on wireless sensor networks (2017): The authors proposed an efficient anomaly detection algorithm centralized in the sink node, where a global and coherent water quality picture should be obtained from the measurements taken locally. The algorithm takes into account the variable characteristics of the water quality parameters. Indeed, a water quality parameter like pH can suddenly exceed the standard thresholds during a measurement window and then return to standard values.

4. STATE OF ART
To enhance the capability of system monitoring, sensors are placed which monitor various parameters of water quality in a timely manner. A cloud environment can also be developed for storage of real-time water quality data from sensors in a real pipeline network [1]. In [3] the authors report that water quality is good for water from the local government water company of Surabaya and Malang, mountain spring water, well water in Malang, and Aqua water; less good for well water in Surabaya; and poor for tap water mixed with soap. In [14] the authors propose a system which can evaluate three parameters of water; the detection of water parameters could reduce the rate of illness and unnecessary death, as well as create consciousness among people for a healthier life. In [2][4] the authors propose systems that can evaluate two to three parameters of water


using pH sensors, turbidity sensors, temperature sensors, s::can and EventLab for contamination detection.

5. GAP ANALYSIS
The use of various sensors was proposed in different systems. To enhance the capability of system monitoring, sensors which monitor various parameters of water quality in a timely manner were used [1]. In another research project, a system was developed which can evaluate three parameters of water; the detection of water parameters could reduce the rate of illness and unnecessary death, as well as create consciousness among people for a healthier life [14]. An efficient anomaly detection algorithm was proposed, centralized in the sink node, where a global and coherent water quality picture should be obtained from the measurements taken locally. The algorithm takes into account the variable characteristics of the water quality parameters. Indeed, a water quality parameter like pH can suddenly exceed the standard thresholds during a measurement window and then return to standard values [7].

6. PROPOSED WORK
The proposed system consists of three major stages: the sensing stage, the computing and controlling stage, and the communication stage. The system is a water quality monitoring system on the Arduino platform that measures the pH, conductivity, temperature, and presence of suspended items on water bodies like lakes and rivers using sensors. These sensed parameters are sent to the authorized person via a server system in the form of messages, so that proper action can be taken by the authority in cleaning the water bodies to reduce the possible health problems that could occur. All switching ON/OFF is done remotely by the relay module (RM).

Fig : System Implementation Plan

7. CONCLUSION
An electronic system is designed to control and monitor the level of water in a tank and a similar reservoir, based on the water detector sensor information. The electronic system is designed to automatically control and display water levels. The proposed system eliminates manual monitoring and controlling for home, agricultural or industrial users. The system achieves proper water management and enhances productivity through automation.

8. FUTURE WORK
Water is a key element for human survival, but unsustainable patterns of water consumption are still evident in our practical life. There is a strong need to change this pattern towards sustainability; the world would indeed cease to exist without the availability of water.

REFERENCES


[1] Vijay, Mahak, S. A. Akbar, and S. C. Jain. "Chlorine decay modelling with contamination simulation for water quality in smart water grid." In 2017 International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS), pp. 3336-3341. IEEE, 2017.
[2] Pawara, Sona, Siddhi Nalam, Saurabh Mirajkar, Shruti Gujar, and Vaishali Nagmoti. "Remote monitoring of waters quality from reservoirs." In Convergence in Technology (I2CT), 2017 2nd International Conference for, pp. 503-506. IEEE, 2017.
[3] Putra, Dito Adhi, and Tri Harsono. "Smart sensor device for detection of water quality as anticipation of disaster environment pollution." In Electronics Symposium (IES), 2016 International, pp. 87-92. IEEE, 2016.
[4] Saab, Christine, Isam Shahrour, and Fadi Hage Chehade. "Smart technology for water quality control: Feedback about use of water quality sensors." In Sensors Networks Smart and Emerging Technologies (SENSET), 2017, pp. 1-4, 2017.
[5] Borawake-Satao, Rachana, and Rajesh Prasad. "Mobility Aware Path Discovery for Efficient Routing in Wireless Multimedia Sensor Network." In Proceedings of the International Conference on Data Engineering and Communication Technology, pp. 673-681. Springer, Singapore, 2017.
[6] Borawake-Satao, Rachana, and Rajesh Prasad. "Comprehensive survey on effect of mobility over routing issues in wireless multimedia sensor networks." International Journal of Pervasive Computing and Communications 12, no. 4 (2016): 447-465.
[7] Jalal, Dziri, and Tahar Ezzedine. "Towards a water quality monitoring system based on wireless sensor networks." In Internet of Things, Embedded Systems and Communications (IINTEC), 2017

International Conference on, pp. 38-41. IEEE, 2017.
[8] Shirode, Mourvika, Monika Adaling, Jyoti Biradar, and Trupti Mate. "IOT Based Water Quality Monitoring System." (2018).
[9] Getu, Beza Negash, and Hussain A. Attia. "Electricity audit and reduction of consumption: campus case study." International Journal of Applied Engineering Research 11, no. 6 (2016): 4423-4427.
[10] Attia, Hussain A., and Beza N. Getu. "Authorized Timer for Reduction of Electricity Consumption and Energy saving in Classrooms." IJAER 11, no. 15 (2016): 8436-8441.
[11] Getu, Beza Negash, and Hussain A. Attia. "Automatic control of agricultural pumps based on soil moisture sensing." In AFRICON, 2015, pp. 1-5. IEEE, 2015.
[12] Bhardwaj, R. M. "Overview of Ganga River Pollution." Report: Central Pollution Control Board, Delhi (2011).
[13] Nivit Yadav, "CPCB Real time Water Quality Monitoring", Report: Center for Science and Environment, 2012.
[14] Faruq, Md Omar, Injamamul Hoque Emu, Md Nazmul Haque, Maitry Dey, N. K. Das, and Mrinmoy Dey. "Design and implementation of cost effective water quality evaluation system." In Humanitarian Technology Conference (R10-HTC), 2017 IEEE Region 10, pp. 860-863. IEEE, 2017.
[15] Le Dinh, Tuan, Wen Hu, Pavan Sikka, Peter Corke, Leslie Overs, and Stephen Brosnan. "Design and deployment of a remote robust sensor network: Experiences from an outdoor water quality monitoring network." In Local Computer Networks (LCN 2007), 32nd IEEE Conference on, 2007.

INTELLIGENT WATER REGULATION USING IOT
Shahapurkar Shreya Somnath1, Kardile Prajakta Sudam2, Shipalkar Gayatri Satish3, Satav Varsha Subhash4
1,2,3,4 Computer Engineering, SCSMCOE, Nepti, Ahmednagar, India
[email protected], [email protected], [email protected], [email protected]

ABSTRACT
The proposed system is implemented with the help of IoT to reduce the issue of wastage of water, and it provides monitoring and controlling of the level of water in a particular water tank. To implement this system we used Android. With the help of this Android application we can record the temperature of the water and the availability of water in the form of water level, using a temperature sensor and a water level sensor respectively, and we provide automatic ON/OFF motor functioning to reduce manual work.

Keywords
IoT Device, Water Level Sensor, Android Application.

1. INTRODUCTION
Water is a very important aspect for each and every living thing: not only for human beings but also for animals and plants. By survey, 71% of the surface of the earth is covered by water, but the reality is that only approximately 2% of water is fresh water we can use for drinking, which is very little compared to today's world population.

Nowadays we can see that in rural and urban areas a lot of water is wasted because of overflow and leakage. In the existing system the management of water wastage is handled manually, but sometimes, because of reasons like unavailability of a person or the lack of a proper medium for communicating with a person to alert them about wastage or leakage of water, the ratio of water wastage increases day by day. To overcome these problems we implement the proposed system. In this proposed system we overcome the problems related to wastage, leakage and overflow of water, and we also provide functionality to know the level of water and measure the temperature of water automatically by using an Android application with the help of IoT.

Algorithm
Step 1: Input data: the first step in functioning is to take initial input data from the level sensor.
Step 2: After sensing the input data from the level sensor, select the level of water with the help of the level sensor.
Step 3: Analog data is processed with the help of the Arduino UNO board to generate the digital output.
Step 4: Generated output is sent to the Android application via Wi-Fi.
Step 5: Motor ON/OFF automation is done with the help of the relay.
Step 6: Status and value of the output are displayed on the Android app.

2. SYSTEM ARCHITECTURE

Fig. 1 System Architecture
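The six algorithm steps can be sketched as one control cycle. The ADC scale, the tank thresholds and the returned status format below are illustrative assumptions; on the real hardware these map to the Arduino UNO's analog input, the relay and the Wi-Fi link to the app.

```python
# Hedged sketch of Steps 1-6 as a single control cycle; not the authors' firmware.

FULL_LEVEL = 90   # percent; assumed cut-off at which the motor stops
EMPTY_LEVEL = 20  # percent; assumed cut-in at which the motor starts

def to_percent(adc_value, adc_max=1023):
    # Step 3: convert the analog level-sensor reading to a digital value
    # (1023 is the usual 10-bit ADC maximum on an Arduino UNO).
    return round(100 * adc_value / adc_max)

def control_cycle(adc_value, motor_on):
    # Steps 1-2: take the input data from the level sensor.
    level = to_percent(adc_value)
    # Step 5: motor ON/OFF automation via the relay.
    if level <= EMPTY_LEVEL:
        motor_on = True
    elif level >= FULL_LEVEL:
        motor_on = False
    # Steps 4 and 6: this status would be sent over Wi-Fi and shown in the app.
    return {"level_percent": level, "motor_on": motor_on}

print(control_cycle(adc_value=150, motor_on=False))
```

A low reading switches the motor on, a near-full reading switches it off, and anything in between leaves the motor state unchanged, which avoids rapid relay toggling around a single threshold.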


3. FLOWCHART

Fig. 2 Flowchart of the system

Advantages
• Using IoT, the user can directly control and monitor the working of the tank through a smartphone.
• The user can operate it from any place in the world.
• The project can be installed in existing water tanks, with no requirement of a new tank for this purpose.
• There is no need to take care of cleaning of the water tank; the system will automatically generate an alert.
• Water wastage is reduced to nearly zero.

4. CONCLUSION AND FUTURE WORK
• This proposed system can be implemented in personal-level areas like schools, colleges, particular industries, private houses or bungalows, housing societies, apartments, hospitals, offices and municipal overhead tanks.
• The system can also be implemented in large-scale water areas like rivers and dams to determine the level of water and theft of water, and to prevent the loss of human life, damage to property, destruction of crops, loss of livestock and deterioration of health conditions because of floods.
• By using the app we will provide alert messages in flood-prone areas.
• In our proposed system the wastage of water and the level of water are controlled and monitored from any location by using a simple Android application with the help of IoT.
• The facilities provided in this system:
i. Motor ON/OFF facility, because automation of the motor reduces manual work as well as wastage of water due to overflow.
ii. With the help of the level sensor we determine the water wastage due to leakage.
iii. The Android application provides the cleaning status periodically.
iv. We determine how much water is consumed in a particular region.

5. ACKNOWLEDGEMENT
We are thankful to Prof. Lagad J. U., Prof. Tambe R. and Prof. Jadhav H., Department of Computer Engineering, Shri Chhatrapati Shivaji Maharaj College Of Engineering.

REFERENCES
[1] Y. Xue, B. Ramamurthy, M.E. Burbach, C.L. Knutson, "Towards a Real-time Groundwater Monitoring Network", Nebraska Water Colloquium, 2007.
[2] P. H. Gleick, "Water resources", in Encyclopedia of Climate and Weather, ed. by S. H. Schneider, Oxford University Press, New York, vol. 2, 1996, pp. 817-823.
[3] J. Ghazarian, T. Ruggieri, A. Balaster, "Secure Wireless Leak Detection System", World Intellectual Property Organization (WIPO), WO/2009/017512, 2009.
[4] C.J. Vörösmarty, P. Green, J. Salisbury, R.B. Lammers, "Global Water Resources: Vulnerability from Climate Change and Population Growth", Science, Vol. 289, no. 5477, 14 July 2000, pp. 284-288.
[5] I. Podnar, M. Hauswirth, and M. Jazayeri, "Mobile push: delivering content to mobile users", Proceedings 22nd International Conference on Distributed Computing Systems Workshops, pp. 563-568, 2002.


SMART NOTICE BOARD
Shaikh Tahura Anjum Vazir, Shaikh Fiza Shaukat, Kale Akshay Ashok
Student, Ahmednagar, Maharashtra, India
[email protected], [email protected], [email protected]

ABSTRACT
A notice board is a surface intended for the posting of public messages, for example, to advertise items wanted or for sale, announce events, or provide information. Notice boards are a mandatory asset used in institutes, organizations and public places. The process of notice board handling is time consuming and hectic. To overcome this problem, a new concept of digital notice board is introduced in this paper. This concept provides a digital way of displaying notices using an Android application and wireless technology.

General Terms
Existing System, Proposed Method, Implementation, Mathematical Model.

Keywords
Notice Board, Wireless Technology, Android application, Kiosk mode, PHP - Hypertext Preprocessor

1. INTRODUCTION
The main concept is to use a Liquid Crystal Display (LCD) to display notices which are controlled using voice commands. We have already seen GSM based notice boards, but voice control allows an extra advantage. The user sends the message from the Android application device; it is received and retrieved by the Wireless Fidelity (WiFi) device at the display unit. The Android application allows the user to take voice commands as input and send them to the Raspberry Pi. This function is carried out using WiFi. After receiving, the sent text is processed and displayed on the LCD screen connected to the Raspberry Pi. The font size is customizable and multiple notices can be displayed at a time. The Raspberry Pi is used as it allows using PHP templates to display notices.

2. EXISTING SYSTEM
• One of the existing systems is implemented using Global System for Mobile Communication (GSM), where Short Message Service (SMS) is used to send notices to the controller, which limits the data size.
• Another existing system uses Bluetooth as the mode of data transfer between the microcontroller and the Android app, but this technique is time consuming.
• An updated system for the above technique includes an Arduino board as a controller to make use of WiFi technology. As the Arduino does not have inbuilt WiFi support, external hardware is used.
• No voice command facility was provided in any of the above systems.

3. PROPOSED METHOD
This section gives a basic overview of the system. Fig. 1 shows the block diagram of the system.

Fig 1: Block diagram of the system

• The notice to be displayed is sent from the Android application using socket programming in Java.
• As wireless transmission is used, a large amount of data can be transferred over the network.


• The client-server model is used for communication. The Android application is the client which sends notices to the server, which is the Raspberry Pi.
• The server is implemented using Python. The server processes the data and displays it on the screen using PHP templates.
• The Raspberry Pi provides two video output facilities: composite Radio Corporation of America (RCA) and High-Definition Multimedia Interface (HDMI).
• The Video Graphics Array (VGA) port of display screens can be used via the HDMI OUT port of the Raspberry Pi 3 model B with an HDMI-to-VGA convertor.
• Therefore, the proposed method is versatile with respect to display options.

4. IMPLEMENTATION
This section explains the execution flow, from establishing communication between the Android application and the Raspberry Pi to displaying the notices on the screen.
• As shown in Fig. 3, first the message is sent from the application and stored at the Raspberry Pi. The message is retrieved and the contents are updated and stored on the SD card.
• The text message is then read from the SD card. The fetched text is wrapped in a template and displayed on the screen using a browser which is open in kiosk mode.
• For the communication to take place, both the Raspberry Pi and the Android application must be connected to the same WiFi network. This can be achieved using server-side coding in Python and making the Raspberry Pi an Access Point.
• In case of power failure, after boot-up on resumption of power supply, the browser window should open automatically so that the display screen is ready to show the notice.
• For aesthetic reasons, the boot messages and the Raspberry Pi logo which appears in the top left corner of the screen can be hidden.

5. MATHEMATICAL MODEL

Fig 2: Mathematical Model

• M1 sends the notice to M2.
• M2 is the access point which provides the network for M1 to connect.
• After receiving the notice from M1, M3 processes it and includes it in M4, which is the PHP template.
• The processed data is sent from M3 to M5.
• M5 displays the message on the LCD screen.
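The M1 to M5 flow can be sketched with standard TCP sockets. Since the paper states the server is implemented in Python, a minimal Python sketch is shown; the port choice, the HTML template string (standing in for the PHP template) and the notice text are illustrative assumptions, not the paper's actual code.

```python
# Hedged sketch: a client (the Android app, M1) sends a notice over a socket
# to the server (the Raspberry Pi, M3), which wraps it in a page template
# (M4) for the display (M5).
import socket
import threading

TEMPLATE = "<html><body><h1>{notice}</h1></body></html>"  # stand-in for the PHP template

def serve_once(server_sock, result):
    conn, _ = server_sock.accept()
    with conn:
        notice = conn.recv(1024).decode("utf-8")
        # Wrap the received notice in the display template.
        result.append(TEMPLATE.format(notice=notice))

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))   # ephemeral port, for the sketch only
server.listen(1)
port = server.getsockname()[1]

pages = []
t = threading.Thread(target=serve_once, args=(server, pages))
t.start()

# M1: the client sends a notice.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"Exam on Monday")

t.join()
server.close()
print(pages[0])
```

On the real device, the rendered page would then be shown by the browser running in kiosk mode.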


Fig 3: Implementation flow chart

6. CONCLUSION
The current world prefers automation and digitalization; in that way this project will be useful in displaying messages, videos and pictures on a wireless e-notice board through an Android application and a Raspberry Pi, by which a message can be sent by users from any location with high data speed. Users will be able to provide notices using voice commands, which is much easier. Only authorized users will have access to the system, which provides security and integrity to the organization using it. Thus the notice board will be more efficient in displaying accurate messages at low cost.

7. FUTURE SCOPE
A GLCD can be implemented for more advancement. Voice calls can also be added for emergency purposes at public places. Voice messages and a buzzer can be included to indicate the arrival of new messages, especially in educational institutions.

REFERENCES
[1] Vinod B. Jadhav, Tejas S. Nagwanshi, Yogesh P. Patil, Deepak R. Patil, "Digital Notice Board Using Raspberry Pi", IJRET, Volume 03, Issue 05, May 2016.
[2] Article named "Wireless data transmission over GSM Short Message Service", retrieved on 11 June 2014 from http://www.eacomm.com/downloads/products/textbox/wdtgsm.pdf
[3] Article titled "How to hide text on boot", retrieved on 20 September 2014 from http://raspberrypieasy.blogspot.in/2013/12/how-hide-textonboot.html
[4] Article titled "How to hide Raspberry Pi LOGO on boot", retrieved on 20 September 2014, 11:30 A.M. from http://raspberrypieasy.blogspot.in/2013/12/how-to-hide-raspberrypi-logo-onboost.html
[5] Article titled "Autorun browser on startup", retrieved 13 August 2014 from http://www.raspberry-projects.com/pi/pioperating%20systems/raspbian/gui/autorun-browser-on-startup
[6] Article titled "WIFI", retrieved on 27 November 2014, 9:45 A.M. from https://www.raspberrypi.org/documentation/configuration/wireless/
[7] Android Application Development Tutorial 186 – Voice Recognition Result, The New Boston, YouTube, http://www.youtube.com/watch?v=8_XW_5JDxpXI, Oct. 2011.
[8] J.M. Noyes and C.R. Frankish, "Speech recognition technology for individuals with disabilities", ISAAC, vol. 8, December 1992.
[9] Wireless Networking Basics, NETGEAR, Inc., 4500 Great America Parkway, Santa Clara, CA 95054 USA.
[10] A Message Proliferation System using Short-Range Wireless Devices, Department of Information Systems and Media Design, Tokyo Denki University.


VEHICLE IDENTIFICATION USING IOT
Miss Yashanjali Sisodia, Mr. Sudarshan R. Diwate
Asst. Prof. (Department of Computer), G. H. Raisoni CEOM, Maharashtra, SPPU University, India
Asst. Prof. (Department of E&TC), G. H. Raisoni CEOM, Maharashtra, SPPU University, India
[email protected], [email protected], [email protected]

ABSTRACT
The aim of the paper is to identify a vehicle which passes through the system; an RFID device will identify the vehicle by using an Arduino Uno. The key element in this system is the passive RFID tag, which is hidden inside the vehicle and acts as a unique identification number for the vehicle. The information of all such tags is maintained by a centralized server. When an unauthorized vehicle tries to pass through the gate, the gate will not open; for an authorized vehicle the gate will open automatically via radio frequency identification of the vehicle using IoT. The system helps the security domain as well.

1. INTRODUCTION
Earlier, residential buildings had no system which identifies information about persons and their vehicles, so unknown vehicles could enter residential buildings. Using this system we are going to solve this problem with IoT. IoT involves extending internet connectivity beyond standard devices, such as desktops, laptops, smartphones and tablets, to any range of traditionally dumb or non-internet-enabled physical devices and everyday objects. Embedded with technology, these devices can communicate and interact over the internet, and they can be remotely monitored and controlled.

In this paper we use RFID for identification of the vehicle. The system identifies the vehicle and controls access to the gate. The system stores information about the persons who live in the residential building and their vehicles; the database stores the name, flat number and vehicle number of each person.

Vehicle tracking systems are popular among people as travel devices and for theft prevention. The main benefit of vehicle tracking systems is security: by monitoring the vehicle's location, they can be used as a protection approach for stolen vehicles, sending position coordinates to the police center as an alert for the theft. When a police center receives an alert for a stolen vehicle, it can take action to prevent the theft. Nowadays, such a system is used either as a replacement for or an addition to car alarms, or as a monitoring system to keep track of the vehicle in real time. Many applications can be used for this purpose, blocking the car's engine or doors as an action to protect the vehicle. Due to advancement in technology, vehicle tracking systems can even identify and detect a vehicle's illegal movements and then alert the owner about these movements. This gives an advantage over the rest of the applications and other pieces of technology that can serve the same purpose using IoT.

2. LITERATURE REVIEW
Prof. Kumthekar A. V., Ms. Sayali Owhal et al. [1] proposed a system in which RFID technology and information management are leading tools that are imperative for future sustainable development of container transportation, not only for port facilities and transportation but also for manufacturers and retailers. The useful application experiences are extremely helpful for widespread and successful RFID adoption in the future. From the analysis of the above-mentioned RFID container transportation implementation,


some key points can be concluded for further RFID application system implementations. As information systems play a crucial role in RFID implementation, information system development is essential for RFID project success, and the RFID information system should be developed as an open system that can be easily integrated with other systems in supply chains. Security is a critical issue for RFID systems, since they manage cargo information that must be protected from theft, modification or destruction. As a new wireless technology that often links to the Internet, security presents additional challenges that must be factored into any installation of RFID systems.

Kashif Ali and Hossam Hassanein [4] presented a system that successfully merges the RFID readers and their tags with a central database, such that all the parking lots in the university can work in a fast and efficient manner. The RFID tag provides a secure and robust method for holding the vehicle identity. The web-based database allows for the centralization of all vehicle and owner records.

Ivan Muller and Renato Machado de Brito [5] note that vehicle tracking systems are popular among people as travel devices and for theft prevention. The main benefit of vehicle tracking systems is security: by monitoring the vehicle's location, they can be used as a protection approach for stolen vehicles, sending position coordinates to the police center as an alert; when a police center receives an alert for a stolen vehicle, it can take action to prevent the theft.

Muhammad Tahir Qadri, Muhammad [6] introduced a new approach that leads to a reconciliation of privacy and availability requirements in anonymous RFID authentication: a generic compiler that maps each challenge-response RFID authentication protocol into another that supports key-lookup operations at constant cost. If the original protocol were to satisfy

anonymity requirements, the transformed one inherits these properties. The result improves the prior best bound on worst-case key-lookup cost of O(log n), by Molnar, Soppera and Wagner (2006). They also show that any RFID authentication protocol that simultaneously provides guarantees of privacy protection and of worst-case constant-cost key-lookup must also imply "public-key obfuscation", at least when the number of tags is asymptotically large. They also consider relaxations of the privacy requirements and show that, if limited linkability is to be tolerated, then simpler approaches can be pursued to achieve constant key-lookup cost.

3. DESIGNING OF SYSTEM
Objective
Vehicle tracking has increased in use over the past few years and, based on current trends, this rise should continue. Tracking offers benefits to both private- and public-sector individuals, allowing real-time visibility of vehicles and the ability to receive advance information regarding legal status and security status. The monitoring system for a vehicle is an integration of RFID technology and a tracking system using IoT.

Theme
In this paper an Arduino is used for controlling all peripherals and activities. The Arduino does not require an external power-supply circuit, because it has an inbuilt one, and it provides additional functionality compared to bare microcontrollers like the PIC or the 8051. In the RFID part, the RFID reader identifies the data recorded on the RFID tag: an RFID tag is provided to every vehicle, its data moves to the RFID reader via radio frequency at 13.56 MHz, and the collected data is shown in the terminal of the PC. This data helps decide which vehicle is authorized or unauthorized. This

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 69


whole data is then sent via the ESP8266 Wi-Fi module over the internet to the mobile phone.

Design
As shown in the figure, the RFID reader and the relay are connected to the Arduino Uno. All data is gathered and stored via the Arduino Uno, so it can easily be accessed any time, anywhere, and the system responds according to that data. This is very efficient and reliable for data-storage purposes, and many things in the system can be analyzed from it. The RFID reader reads the information from the RFID tag, and the relay controls the motor. The motor is connected to a circular rod which acts as a gate, so according to the data the system responds very quickly. All data goes to the PC, and for the mobile phone the ESP8266 is connected to the Arduino. On the mobile, the authorized and unauthorized vehicle ID numbers are sent via the ESP8266 Wi-Fi module, which is connected to the internet.
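The authorization step described above — read a tag ID, compare it with the stored data, and open the gate only for known vehicles — can be sketched in a few lines. This is an illustrative sketch, not code from the paper: the tag IDs, the `AUTHORIZED_TAGS` set and the `check_vehicle` helper are hypothetical names, and in the real system the gate motor is driven by the Arduino's relay rather than by the PC.

```python
# Illustrative sketch of the tag-authorization logic (names are hypothetical).
# The Arduino forwards each scanned tag ID to the PC terminal; this PC-side
# helper decides whether the vehicle is authorized.

AUTHORIZED_TAGS = {"04A1B2C3", "04D4E5F6"}  # hypothetical stored tag IDs

def check_vehicle(tag_id: str) -> str:
    """Return the decision shown on the terminal for one scanned tag."""
    if tag_id in AUTHORIZED_TAGS:
        return f"{tag_id}: authorized vehicle - open gate"
    return f"{tag_id}: unauthorized vehicle - keep gate closed"

print(check_vehicle("04A1B2C3"))
print(check_vehicle("FFFFFFFF"))
```

The same decision string could then be forwarded over the ESP8266 link to the mobile, mirroring the data flow described above.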

Figure 1: Block diagram of vehicle detection using RFID reader

Arduino Uno:
The Arduino Uno is a microcontroller board based on the 8-bit ATmega328P microcontroller. Along with the ATmega328P, it carries supporting components such as a crystal oscillator, serial communication circuitry and a voltage regulator. The Arduino Uno has 14 digital input/output pins (of which 6 can be used as PWM outputs), 6 analog input pins, a USB connection, a power barrel jack, an ICSP header and a reset button. It can communicate with a computer, another Arduino board or other microcontrollers. The ATmega328P microcontroller provides UART TTL (5V) serial communication, available on digital pin 0 (RX) and digital pin 1 (TX). An ATmega16U2 on the board channels this serial communication over USB and appears as a virtual COM port to software on the computer. The ATmega16U2 firmware uses the standard USB COM drivers, and no external driver is needed; however, on Windows, an .inf file is required. The Arduino software includes a serial monitor which allows simple textual data to be sent to and from the board. The RX and TX LEDs on the board flash when data is being transmitted via the USB-to-serial chip and USB connection to the computer (not for serial communication on pins 0 and 1). The SoftwareSerial library allows serial communication on any of the Uno's digital pins. The ATmega328P also supports I2C (TWI) and SPI communication, and the Arduino software includes a Wire library to simplify use of the I2C bus.

Figure 2: Arduino Uno

RC-522 13.56 MHz RFID Reader
This low-cost MFRC522-based RFID reader module is easy to use and can be used in a wide range of applications. The MFRC522 is a highly integrated reader/writer IC for contactless communication at 13.56 MHz. Its transmission module utilizes an outstanding, fully integrated modulation and demodulation concept for different kinds of contactless communication methods and protocols at 13.56 MHz.


DC MOTOR
A DC motor is an electric motor that runs on direct-current (DC) electricity. DC motors were used to run machinery, often eliminating the need for a local steam engine or internal combustion engine, and they can operate directly from rechargeable batteries, which provided the motive power for the first electric vehicles. Today DC motors are still found in applications as small as toys and disk drives, and in large sizes to operate steel rolling mills and paper machines. Modern DC motors are nearly always operated in conjunction with power-electronic devices. In any electric motor, operation is based on simple electromagnetism: a current-carrying conductor generates a magnetic field, and when it is placed in an external magnetic field it experiences a force proportional to the current in the conductor and to the strength of the external magnetic field. The internal configuration of a DC motor is designed to harness this magnetic interaction between a current-carrying conductor and an external magnetic field to generate rotational motion. Every DC motor has six basic parts: axle, rotor, stator, commutator, field magnet(s) and brushes. In most common DC motors, the external magnetic field is produced by high-strength permanent magnets. The stator is the stationary part of the motor; it includes the motor casing as well as two or more permanent-magnet pole pieces. The rotor (together with the axle and attached commutator) rotates with respect to the stator and consists of windings (generally on a core) electrically connected to the commutator.

ESP8266 WI-FI MODULE
The ESP8266 Wi-Fi module is a self-contained SoC with an integrated TCP/IP protocol stack that can give any microcontroller access to a Wi-Fi network. The ESP8266 is capable of either hosting an application or offloading all Wi-Fi networking functions from another application processor. Each ESP8266

module comes pre-programmed with AT-command-set firmware, meaning you can simply hook it up to an Arduino and get about as much Wi-Fi capability as a Wi-Fi Shield offers, out of the box. The ESP8266 module is an extremely cost-effective board with a huge and ever-growing community. The module has powerful enough on-board processing and storage capability to be integrated with sensors and other application-specific devices through its GPIOs, with minimal up-front development and minimal loading during runtime. Its high degree of on-chip integration allows for minimal external circuitry, and the front-end module is designed to occupy minimal PCB area. The ESP8266 supports APSD for VoIP applications and Bluetooth co-existence interfaces; it contains a self-calibrated RF front end allowing it to work under all operating conditions, and it requires no external RF parts. There is an almost limitless fountain of information available for the ESP8266, provided by the community, including instructions on how to transform this module into an IoT (Internet of Things) solution.

Specification
Hardware: Arduino Uno, RFID sensor (MFRC522), motor, relay
Software: Arduino IDE 1.6.8

6. CONCLUSION
The project is helpful for the identification of a vehicle via RFID using IoT. It can help at any stage of a security-domain system in residential buildings, colleges, schools, malls, etc. When an unauthorized vehicle passes through the gate, the RFID reader identifies it according to the stored data and the gate does not open; when an authorized vehicle nears the gate, it opens. All data goes to the PC, and for the mobile phone the ESP8266 is connected to the Arduino. On the mobile, the authorized


and unauthorized vehicle ID numbers are sent via the ESP8266 Wi-Fi module, which is connected to the internet.

REFERENCES
[1] Prof. Kumthekar A. V., Ms. Sayali Owhal, Ms. Snehal Supekar, Ms. Bhagyashri Tupe, International Research Journal of Engineering and Technology (IRJET), Volume 05, April 2018.
[2] Liu Bin, Lu Xiaobo and Gao Chaohui, "Comparing and testing of ETC modes in Chinese freeway", Journal of Transportation Engineering and Information, 5(2), 2007, pp. 31-35.
[3] Stevan Preradovic, Isaac Balbin, Nemai C. Karmakar and Gerry Swiegers, "A Novel Chipless RFID System Based on Planar Multiresonators for Barcode Replacement", 2008.
[4] Kashif Ali, Hossam Hassanein, "Passive RFID for Intelligent Transportation Systems", 2009 6th IEEE Consumer Communications and Networking Conference.
[5] Ivan Muller, Renato Machado de Brito, Carlos Eduardo Pereira and Valner Brusamarello, "Load cells in force sensing analysis - theory and a novel application", IEEE Instrumentation & Measurement Magazine, Volume 13, Issue 1.
[6] Muhammad Tahir Qadri, Muhammad Asif, "Automatic Number Plate Recognition System for Vehicle Identification Using Optical Character Recognition", 2009 International Conference on Education Technology and Computer.


WIRELESS COMMUNICATION SYSTEM WITHIN CAMPUS
Mrs. Shilpa S. Jahagirdar, Mrs. Kanchan A. Pujari

Department of Electronics and Telecommunication, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India. 1 [email protected] , [email protected]

ABSTRACT
The system "Wireless Communication System Within Campus" can be seen as a smaller version of a smart campus. It is observed that providing educational material or important notices to students is still done through old methods like dictating notes in class or physical distribution, which is very time consuming. This valuable time of faculties as well as students can be saved through the use of technology, and this approach will also make it easier for students to acquire important notices and required educational material. By making use of today's advanced electronic techniques and the capabilities of smartphones' powerful processors and large memories, a system is designed that lets students view important information through an application over Wi-Fi, without internet connectivity. This will help in better sharing and spread of important messages and information amongst the campus students. The students can view or download required educational material and important messages through the application.

Keywords: wireless communication, smart phone, Wi-Fi, application

1. INTRODUCTION
In the system that is typically followed in colleges, the students and teachers have to communicate every day for many activities. The notices, educational material or any other sort of information has to be spread through either physical means or internet access; this might consume a lot of effort, as paperwork is slow and not everyone at college may have the privilege of internet access. The current process of information sharing has problems such as:
• Notices are shared on paper from class to class, which is time consuming.
• Searching backdated data might be difficult.
• The manual process is slower and may cause errors.
• Every student or faculty member in the college may not have access to the internet.
• Excessive use of paper and other resources.

As electronic techniques advanced, computing machines have been miniaturized and smartphones are equipped with powerful processors and large memories. In consequence, various services have become available on smartphones. Since a smartphone is a personal belonging, it is an excellent candidate device on which context-aware services can be provided. As an example of a context-aware service on a smartphone, the campus guide is picked up and its implementation is introduced in this paper. The "Wireless Communication Within Campus" system consists of a server and clients. The main features of the client include sharing of information, important messages and educational material between the client and server on Android mobile phones. This will help in better sharing and spread of important messages and information amongst the campus students. The students can get the required educational material and important messages through the application. Use of the same application can be extended for faculties to control the electrical appliances in the department. The application does not need internet access, hence no internet service is mandatory.


The application only needs to be connected to the Raspberry Pi Wi-Fi in the college or department premises. This system eases communication and data sharing without using resources such as paper, manual effort and an internet connection.

2. MOTIVATION
Present-day smart campus systems propose applications such as knowing or measuring the area of a building or classrooms in a college, or locating a user through an Android app in the college or campus area. The major issue in a college or campus is the difficulty of data sharing amongst the students and the staff: many of the users are not connected to the internet during college hours, and important notices have to be displayed on the notice board or shared from class to class, increasing the manual effort. This process can be time consuming and cause manual errors. There is also the problem of controlling the electric appliances in the classes, where one has to go and manually switch the appliances on or off. In this paper, efforts are made to solve these issues by using an Android app and a Raspberry Pi module: a student can access the data sent by a teacher over the Wi-Fi module, and power control is added so that electric appliances can be controlled within the range of the Raspberry Pi.

3. METHODOLOGY
The system works as a storage medium and is Wi-Fi enabled; it uploads the information to a web server designed for this application. The uploaded file is stored and can be viewed or downloaded using an Android application. Faculties are able to turn electric appliances like fans or lights ON/OFF remotely from the server with the help of the Raspberry Pi and a relay assembly.

Figure 1: Block Diagram of the System

The system uses a Raspberry Pi 3 as the heart of the system, which looks after all the communication in the system. The Wi-Fi of the Raspberry Pi is used as the medium to connect the Android apps. Socket programming is used for the communication, and the app is designed in such a way that it can be accessed by authorized persons only: a student is given a separate password and ID (USER ID), while a faculty member has a different password and ID (ADMIN ID). Thus the system also preserves the privacy of the users and prevents miscommunication. The memory of the Raspberry Pi is used as the storage unit for the uploaded data; it thus works as cloud-like storage for the Android app. The Android application (app) has options such as upload, download and view. The GUI design is different for teachers and students based on their respective login as faculty (ADMIN) or student (USER); this GUI is created using the Eclipse software. The faculty can also control the electrical appliances of the department using their Android application, whereas the student login is not provided with this extra feature; this option is only provided in the faculty login GUI.

4. ALGORITHM
Start


iv. Change the directory path to the predefined location.
v. Set the direction of the GPIO pins to output.
vi. Open the socket with a fixed port number.

To accept connections, the following steps are performed:
1. A socket is created with socket().
2. The socket is bound to a local address using bind(), so that other sockets may be connected to it.
3. A willingness to accept incoming connections and a queue limit for incoming connections are specified with listen().
4. The client connects its socket using the connect() method.
5. Connections are accepted with accept().
6. Read one byte of data from the socket.
7. Convert that byte from ASCII to an integer with the atoi() function.
8. Feed the byte into the switch case.
9. If switch case 1:
   i. Read the file data from the client and save it on the server.
   ii. First read the file size.
   iii. Allocate memory with malloc().
   iv. Read the actual file data.
   v. Read the name of the file; for that, first read the size of the filename.
   vi. Allocate memory for the filename.
   vii. Read the actual filename.
   viii. Write the file data into the file.
   ix. Free the memory allocated with malloc().
10. If switch case 2:
   i. Read the pathname; for that, first read the size of the pathname.
   ii. Allocate memory for the pathname using malloc().
   iii. Read the actual path data.
   iv. Pass the directory path to the list_dir() function.
   v. The return values of this function are the file and directory listing and its length.
   vi. Write the file and directory listing and its length to the socket.
   vii. Free the memory allocated with malloc().
11. If switch case 3:
   i. Read the file; for that, first read the size of the filename.
   ii. Allocate memory for the filename using malloc().
   iii. Read the actual filename.
   iv. Read the file with the fread() function, which returns the file content and its length.
   v. Write the file length and file content to the socket.
   vi. Free the memory allocated with malloc().
12. If switch case 4: Device 1 is turned ON.
13. If switch case 5: Device 1 is turned OFF.
14. If switch case 6: Device 2 is turned ON.
15. If switch case 7: Device 2 is turned OFF.
16. If switch case 0: All devices are turned OFF.

5. RESULTS
Screen shots of various pages of the application are as follows.

Figure 2: Screen shot 1 of Android Application (LOGIN PAGE)

Figure 3: Screen shot 2 of Android Application (CONFIGURATION)
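The connection-handling steps of the algorithm (socket(), bind(), listen(), accept(), then a one-byte command dispatched through a switch) can be sketched with Python's socket API, which mirrors the C calls the algorithm names. This is a simplified sketch, not the paper's implementation: the port number and the `devices` dictionary standing in for the Raspberry Pi GPIO pins are assumptions, and only the device-control cases (4-7 and 0) are shown, omitting the file-transfer cases.

```python
import socket

devices = {1: False, 2: False}  # stand-in for GPIO output pins (assumed)

def handle_command(cmd: str) -> None:
    """Dispatch one command byte, mirroring switch cases 4-7 and 0."""
    if cmd == "4":
        devices[1] = True                # case 4: device 1 ON
    elif cmd == "5":
        devices[1] = False               # case 5: device 1 OFF
    elif cmd == "6":
        devices[2] = True                # case 6: device 2 ON
    elif cmd == "7":
        devices[2] = False               # case 7: device 2 OFF
    elif cmd == "0":
        devices[1] = devices[2] = False  # case 0: all devices OFF

def serve_once(port: int = 5005) -> None:
    # Steps 1-3, 5-6 of the algorithm: socket(), bind(), listen(), accept(), read.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("0.0.0.0", port))
        srv.listen(1)  # queue limit for incoming connections
        conn, _addr = srv.accept()
        with conn:
            cmd = conn.recv(1).decode("ascii")  # read one command byte
            handle_command(cmd)
```

The file-transfer cases (1-3) would follow the same pattern, first reading a length and then that many bytes, matching the size-then-data framing in the step list above.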


Figure 4: Screen shot 3 of Android Application (MENU SELECTION)

6. APPLICATIONS
This system is useful for easy communication between students and faculty without the use of internet access or paper wastage. The system GUI differs from user to user depending on their login as faculty (ADMIN) or student (USER). Faculties, with their own login, can upload and download documents and can also operate the electric appliances in a particular classroom; students can only view or download the required documents. In the current system, data is stored in the memory of the Raspberry Pi, but in future this system can be made IoT-based by storing all data in the cloud.

7. CONCLUSION
After making a survey of the existing smart campus systems, the "Wireless Communication System Within Campus" can be implemented. It gives


easy access to students for getting the required educational material and important notices using an Android application, and also for downloading them if required. All the devices are connected through Wi-Fi using an application on an Android phone. The system makes the task of sharing files and important data easy, and the same application also controls electric appliances. So this system reduces the human effort of sharing important notices from class to class or from faculty to students, and also helps in controlling electric appliances from a distance instead of manually going to the place and switching them ON/OFF. It is also easier now to access previously shared information.

REFERENCES
[1] Min Guo, Yu Zhang, "The Research of Smart Campus Based on Internet of Things & Cloud Computing", Sept. 2015.
[2] Dhiraj Sunhera, Ayesha Bano, "An Intelligent Surveillance with Cloud Storage for Home Security", Annual IEEE India Conference, 2014.
[3] Xiao Nie, "Constructing Smart Campus Based on the Cloud Computing Platform and the Internet of Things", 2nd International Conference on Computer Science and Electronics Engineering, 2013.
[4] Suresh S., H. N. S. Anusha, T. Rajath, P. Soundarya and S. V. Prathyusha Vudatha, "Automatic Lighting and Control System for Classroom", Nov. 2016.
[5] Piotr K. Tysowski, Pengxiang Zhao, Kshirasagar Naik, "Peer to Peer Content Sharing on Ad Hoc Networks of Smartphones", 7th International Conference, July 2011.
[6] Agus Kurniawan, "Getting Started with Raspberry Pi 3", 1st edition.


LICENSE PLATE RECOGNITION USING RFID
Vaibhavi Bhosale, Monali Deoghare, Dynanda Kulkarni, Prof. S. A. Kahate

Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India. [email protected], [email protected], [email protected], [email protected]

ABSTRACT

The objective of this project is to design an efficient automatic authorized-vehicle identification system using the vehicle number plate and RFID. The developed system first detects the vehicle's RFID tag and then captures the vehicle number plate. The number plate is detected with the help of the RFID tag situated on the vehicle, and the resulting data, together with the data extracted from the RFID tag, is compared with the records in a database. The database can hold specific information such as the vehicle owner's name, place of registration, address, etc. If both the tag ID and the number match the database, the system shows the message "authorized person"; otherwise it shows "unauthorized person". Both must match the database. If any vehicle breaks a signal, the RTO has the authority to send the fine details by courier to the given address.

1. INTRODUCTION
Robust and accurate detection and tracking of moving objects has always been a complex problem. Especially in the case of outdoor video surveillance systems, the visual tracking problem is particularly challenging due to illumination or background changes, occlusion problems, etc. An algorithm for vehicle number plate extraction, character segmentation and recognition is presented. If a vehicle breaks the signal, a report is immediately sent to the RTO center; the RTO checks that vehicle's details and applies the fine. Here the vehicle number plate is detected using the RFID tag situated on the vehicle.

2. MOTIVATION
In traffic surveillance, tracking of a vehicle is a vital job. We propose a real-time application which recognizes license plates from vehicles to track the vehicle's path using an RFID tag and reader. It is very difficult to identify lost vehicles and also vehicles which violate traffic rules. Therefore, it is necessary to detect the number plate of the vehicle and use this detected number to track the information of the vehicle and its owners. This system can be implemented at tolls to identify stolen vehicles; the RFID tags will help to identify the authorized owner of the vehicle, which will provide security to society. The system's robustness and speed can be increased if high-frequency readers are used. If GPS is implemented, we will be able to trace the vehicle's movements and extract the vehicle's number, and the identified owner will be sent an SMS about the movements with the use of a GSM module.

3. STATE OF ART
[1] The essence of keystroke dynamics is not what you type, but how you type. This paper mainly presents a proposed authentication system supporting keystroke dynamics as a biometric for authentication. It uses inter-key delays of the password and the account for user identification in the system design. There are suggestions in the literature that a combination of key-hold time with the inter-key delay can improve the performance further.

[2] We propose to use RFID technology to combine the functions of physical access control, computer access control and management, and digital signature systems. This combination allows systems' security to be drastically increased. Even low-end RFID tags can add one security level to the system, but high-end


RFID tags with cryptographic capabilities and a slight modification of the digital-signature calculation procedure make it possible to prevent obtaining digital signatures for fraudulent documents. A further evolution of the proposed scheme is permanent monitoring by periodically checking the user's RFID tag, i.e., whether the authenticated user is present at the computer with restricted access.

[3] Mobile SNS is one of the most popular topics of the mobile Internet. In order to fulfill the user demand for a self-maintained independent social network and to ensure the privacy of personal information and resources, the paper proposes a system architecture for a decentralized mobile SNS. A mechanism and algorithm are devised for complete deletion of the user profile when users quit the service in temporary scenarios.

[4] An encryption scheme for exchanging item-level data by storing it in a central repository. It allows the data owner to enforce access control at item level by managing the corresponding keys. Furthermore, data remains confidential even against the repository provider, which eliminates the main problem of the central approach. Formal proofs are provided that the proposed encryption scheme is secure, and the scheme is evaluated with databases containing up to 50 million tuples. Results show that the encryption scheme is fast, scalable and can be parallelized very efficiently. The encryption scheme thereby reconciles the conflict between security and performance in item-level data repositories.

[5] Developed a smart ration card using the Radio Frequency Identification (RFID) technique to prevent ration forgery, as there are chances that the shopkeeper may sell the material to someone else, take the profit and put some false amount in the records. In this system, an RFID tag is used that carries the family member details, and the customer needs to show this tag to the RFID reader. The microcontroller connected to the reader checks the user authentication. If the user is found authentic, then the quantity of ration to be given to the customer according to the total number of family members is displayed on the display device.

Proposed Work

Fig: Introduction to the Proposed System

The first goal of this project is to modernize the present system and design new solutions for identification and registration of vehicles based on RFID technology. Radio-frequency identification technology, thanks to its contactless manner of identifying things and objects, provides better and safer solutions, particularly in conjunction with a camera system.

Advantages

In this project we have thought out a system which is simple, cheap, reliable, and which has at least some fundamental advantages over conventional automated systems. Here, a microcontroller-controlled wireless communication system has been used, which makes the system not only automatic but also flexible.

4. CONCLUSION AND FUTURE WORK

Here we conclude: an automatic vehicle identification system using the vehicle license plate and RFID technology is presented. The system identifies the vehicle from the database stored on the PC. The objective of this project is to design an efficient automatic authorized-vehicle identification system using the vehicle number plate and RFID. The Automatic Number Plate Recognition (ANPR) system is an important technique used in Intelligent Transportation Systems: ANPR is an advanced machine-vision technology used to identify vehicles by their number plates without direct human intervention, and the decisive portion of an ANPR system is the software model. We also implemented a further process: if any vehicle breaks the signal, our system can detect that vehicle's tag number and check the details of that vehicle in order to apply a fine.
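The two-factor check described in the abstract and conclusion — the RFID tag ID and the recognized plate number must both match the same database record before the vehicle is treated as authorized — can be sketched as follows. The record contents, tag IDs and helper names here are hypothetical, and the plate-recognition (ANPR) step itself is assumed to have already produced the plate string.

```python
# Hypothetical database: RFID tag ID -> registered vehicle record.
VEHICLE_DB = {
    "TAG001": {"plate": "MH12AB1234", "owner": "A. Sharma"},
    "TAG002": {"plate": "MH14XY9876", "owner": "B. Patil"},
}

def verify(tag_id: str, plate: str) -> str:
    """Authorized only if the tag exists AND its registered plate matches."""
    record = VEHICLE_DB.get(tag_id)
    if record is not None and record["plate"] == plate:
        return "authorized person"
    return "unauthorized person"

print(verify("TAG001", "MH12AB1234"))  # both match
print(verify("TAG001", "MH99ZZ0000"))  # tag known, plate mismatch
```

Requiring both factors to agree means a cloned tag on the wrong vehicle, or a copied plate without the matching tag, is still reported as unauthorized.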

REFERENCES
[1] Hsiao-Ying Huang, "Privacy by Region: Evaluating Online Users' Privacy Perceptions by Geographical Region", FTC 2016 - Future Technologies Conference, 6-7 December 2016.
[2] Hyoungshick Kim, "Design of a secure digital recording protection system with network connected devices", 2017 31st International Conference on Advanced Information Networking and Applications Workshops.
[3] Chao-Hsien Lee and Yu-Lin Zheng, "SQL-to-NoSQL Schema Denormalization and Migration: A Study on Content Management Systems", 2015 IEEE International Conference on Systems, Man, and Cybernetics.
[4] Chun-wei Tseng, "Design and Implementation of a RFID-based Authentication System by Using Keystroke Dynamics".
[5] Andrey Larchikov, Sergey Panasenko, Alexander V. Pimenov, Petr Timofeev, "Combining RFID-Based Physical Access Control Systems with Digital Signature Systems to Increase Their Security".


DATA ANALYTICS AND MACHINE LEARNING


Online Recommendation System
Swapnil N Patil, Vaishnavi Jadhav, Kiran Patil, Shailja Maheshwari

Asst. Professor, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India.
[email protected], [email protected], [email protected], [email protected]

ABSTRACT
In today's world, everyone tends towards the internet, and usage of the internet is increasing day by day. The online shopping trend increases as internet usage increases. Online consumer reviews influence consumer decision-making: an end-user reads the reviews of a product left by previous users and learns its good and bad points. The Web provides an extensive source of consumer reviews, but one can hardly read all reviews to obtain a fair evaluation of a product or service. On this basis, sentiment analysis is the process of computationally identifying and categorizing opinions expressed in a piece of text, especially in order to determine whether the writer's attitude towards a particular topic, product, etc. is positive, negative or neutral. In this paper we perform sentiment analysis of each review and give a proper recommendation to the end user, working with both supervised and unsupervised methodologies. The system uses a real-time dataset of product reviews.

Keywords: Machine learning, Opinion mining, Statistical measures, Early reviewer, Early review.

1. INTRODUCTION
Nowadays if we want to purchase something, we go online, search for products and look at their reviews. A user has to go through each and every review to get information regarding every aspect of a product. Some of these reviews contain large amounts of text and detailed information about the product and its aspects, and a user may have to go through all of these reviews for help in decision making. Some products can have a large number of reviews, with aspect information buried in large text corpuses, and a user might get irritated while reading all of them to learn about the product. To avoid this, a system is needed that can analyze these reviews and detect the sentiments they express for every aspect. Existing approaches fail to cover the fact that two reviews may mention the same aspect with two different words: existing systems consider those as two different aspects. Also, aspect-wise information is not preserved by these systems, as they rely mostly on the rating provided by different users to show quality or an overall rating. The paper proposes a system that can use the information in reviews to evaluate the quality of these products' aspects. The proposed system also categorizes these aspects, so that the problem of different words for the same aspect can be resolved. The aspects are identified using supervised and unsupervised techniques and then grouped into categories. The sentiment or opinion a user provides for a particular aspect is assigned to the category of that aspect. Using natural language processing techniques, the opinions are rated on a scale of 1 to 5, and these ratings are used to evaluate the quality of the products.

2. RELATED WORK
Opinion Mining and Sentiment Analysis: Opinion mining is a type of natural language processing for tracking the mood of the public about a particular product. The paper focuses on designing and developing a rating and review-summarization system in a mobile environment. This research examines the influence of recommendations on

ISSN: 0975-887 | Department of Computer Engineering, SKNCOE, Vadgaon(Bk), Pune.

Page 81

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

consumer decision making during online shopping experiences. The recommender system recommends products to users, and the extent to which these recommendations affect consumer decisions about buying products is analyzed in this paper. A comparison with the state of the art for opinion mining is done by Horacio Saggion et al. (2009). Ana-Maria Popescu and Oren Etzioni introduce an unsupervised information extraction system which mines reviews in order to build a model of important product features, their evaluation by reviewers, and their relative quality across products (Popescu and Etzioni, 2005).

Early Adopter Detection: An early adopter could refer to a trendsetter, e.g., an early customer of a given company, product or technology. The importance of early adopters has been widely studied in sociology and economics. It has been shown that early adopters are important in trend prediction, viral marketing, product promotion, and so on. The analysis and detection of early adopters in the diffusion of innovations have attracted much attention from the research community. Generally speaking, three elements of a diffusion process have been studied: attributes of an innovation, communication channels, and social network structures.

Modeling Comparison-Based Preference: By modeling comparison-based preference, we can essentially perform any ranking task. For example, in information retrieval (IR), learning to rank aims to learn the ranking of a list of candidate items with manually selected features.

Distributed Representation Learning: Since its seminal work, distributed representation learning has been successfully used in various application areas including natural language processing (NLP), speech recognition and computer vision. In NLP, several semantic embedding models have been proposed, including word embedding and phrase embedding, such as word2vec. In this paper we use natural language processing

for sentiment analysis of users' reviews. Whether a user gives a negative, positive or neutral review is determined by this sentiment analysis. The Use Case Diagram:

Fig 1: Use case

The Sequence Diagram

Fig 2: Sequence Diagram
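The supervised side of the sentiment classification described above can be sketched with a small bag-of-words Naive Bayes classifier. This is a minimal illustration only: the toy training reviews below are invented, not the paper's real-time dataset.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Lowercase and keep alphabetic tokens only.
    return [w for w in text.lower().split() if w.isalpha()]

def train_nb(labeled_reviews):
    """Train a multinomial Naive Bayes sentiment model."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in labeled_reviews:
        class_counts[label] += 1
        for w in tokenize(text):
            word_counts[label][w] += 1
            vocab.add(w)
    return word_counts, class_counts, vocab

def classify(text, model):
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for label in class_counts:
        # log prior + sum of Laplace-smoothed log likelihoods
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in tokenize(text):
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

# Toy training reviews (invented for illustration).
reviews = [
    ("the camera is excellent and battery life is great", "positive"),
    ("great screen and excellent sound quality", "positive"),
    ("the battery is terrible and the camera is poor", "negative"),
    ("poor build quality and terrible support", "negative"),
]
model = train_nb(reviews)
print(classify("excellent camera", model))   # positive
print(classify("terrible battery", model))   # negative
```

In practice the same interface would be trained on the labeled product-review dataset; the unsupervised side could instead score reviews against a polarity lexicon.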


3. MOTIVATION
We all use users' reviews to evaluate the quality of a product we wish to purchase online. A user may be looking for one particular feature of a product (e.g., the camera in phones), and products having good quality for that feature should be preferred in the results. For this, detailed information about features is needed, along with a system that can fetch this information from user reviews.

System Architecture: In our system, the user first searches for the product and reviews it; sentiment analysis is applied to that review to generate a rating. When another user views the product, the reviews help them.

Fig 3: System overview

Activity Diagram:

Fig 4: Activity Diagram
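As described in the introduction, opinions assigned to aspect categories are rated on a 1-to-5 scale and aggregated to evaluate product quality. A minimal sketch of that aggregation step follows; the synonym map and the sample ratings are invented for illustration, not the system's actual lexicon.

```python
from collections import defaultdict

# Hypothetical synonym map: different words reviewers use for the
# same aspect are folded into one category.
ASPECT_CATEGORIES = {
    "camera": "camera", "photo": "camera", "picture": "camera",
    "battery": "battery", "charge": "battery",
    "screen": "display", "display": "display",
}

def aggregate_aspect_ratings(opinions):
    """opinions: list of (aspect_word, rating 1-5). Returns mean rating per category."""
    ratings = defaultdict(list)
    for word, rating in opinions:
        category = ASPECT_CATEGORIES.get(word.lower())
        if category is not None:
            ratings[category].append(rating)
    return {cat: sum(r) / len(r) for cat, r in ratings.items()}

# Toy opinions already scored on the 1-5 scale.
opinions = [("camera", 5), ("photo", 4), ("battery", 2), ("charge", 3), ("screen", 4)]
print(aggregate_aspect_ratings(opinions))
# {'camera': 4.5, 'battery': 2.5, 'display': 4.0}
```

Folding "photo" and "camera" into one category is what resolves the "different words for the same aspect" problem noted above.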

4. GAP ANALYSIS

Sr. no | Year | Author Name | Paper Name | Paper Description
1 | 2016 | Julian McAuley, Alex Yang | Addressing Complex and Subjective Product-Related Queries with Customer Reviews | 'Relevance' is measured in terms of how helpful the review will be in identifying the correct response.
2 | 2012 | Ida Mele, Francesco Bonchi, Aristides Gionis | The Early-Adopter Graph and its Application to Web-Page Recommendation | By tracking the browsing activity of early adopters we can identify new interesting pages early and recommend these pages to similar users.
3 | 2012 | Manuela Cattelan | Models for Paired Comparison Data: A Review with Emphasis on Dependent Data | There are other situations that may be regarded as comparisons from which a winner and a loser can be identified without the presence of a judge.
4 | 2010 | Ee-Peng Lim, Viet-An Nguyen, Nitin Jindal | Detecting Product Review Spammers using Rating Behaviors | Given that such labels do not exist in public, we decide to conduct user evaluation on different methods derived from the spamming behaviors proposed in this paper.

5. CONCLUSION
A system with two methods for detecting aspect categories, useful for online review summarization, is proposed. The system uses spreading activation to identify categories accurately, weighs the importance of each aspect, and can identify the sentiment for a given aspect. Our experiments also indicate that early reviewers' ratings and their received helpfulness scores are likely to influence product popularity at a later stage. We have adopted a competition-based viewpoint to model the review posting process, and developed a margin-based embedding ranking model for predicting early reviewers in a cold-start setting.

6. ACKNOWLEDGEMENT
We express our gratitude to Prof. Swapnil N. Patil for his patronage and for giving us the opportunity to undertake this project. We owe a deep sense of gratitude to him for his constant encouragement, valuable guidance and support towards the successful completion of our preliminary project report. We appreciate the guidance given by other

supervisors as well as the panels, especially in our project presentation, whose comments and advice improved our presentation skills. A special thanks to our teammates, who helped prepare this report. Last but not least, we extend our sincere thanks to our family members and friends for their constant support throughout this project.

REFERENCES
[1] J. McAuley and A. Yang, "Addressing complex and subjective product-related queries with customer reviews," in WWW, 2016, pp. 625-635.
[2] N. V. Nielsen, "E-commerce: Evolution or revolution in the fast-moving consumer goods world," nngroup.com, 2014.
[3] Salganik M. J., Dodds P. S., W. D. J., "Experimental study of inequality and unpredictability in an artificial cultural market," in ASONAM, 2016, pp. 529-532.
[4] R. Peres, E. Muller, and V. Mahajan, "Innovation diffusion and new product growth models: A critical review and research directions," International Journal of Research in Marketing, vol. 27, no. 2, pp. 91-106, 2010.
[5] L. A. Fourt and J. W. Woodlock, "Early prediction of market success for new grocery products," Journal of Marketing, vol. 25, no. 2, pp. 31-38, 1960.


[6] B. W. O, "Reference group influence on product and brand purchase decisions," Journal of Consumer Research, vol. 9, pp. 183-194, 1982.
[7] J. J. McAuley, C. Targett, Q. Shi, and A. van den Hengel, "Image-based recommendations on styles and substitutes," in SIGIR, 2015, pp. 43-52.
[8] E. M. Rogers, Diffusion of Innovations. New York: The Rise of High-Technology Culture, 1983.
[9] K. Sarkar and H. Sundaram, "How do we find early adopters who will guide a resource constrained network towards a desired distribution of behaviors?" in CoRR, 2013, p. 1303.
[10] D. Imamori and K. Tajima, "Predicting popularity of twitter accounts through the discovery of link-propagating early adopters," in CoRR, 2015, p. 1512.


[11] X. Rong and Q. Mei, "Diffusion of innovations revisited: from social network to innovation network," in CIKM, 2013, pp. 499-508.
[12] I. Mele, F. Bonchi, and A. Gionis, "The early-adopter graph and its application to web-page recommendation," in CIKM, 2012, pp. 1682-1686.
[13] Y.-F. Chen, "Herd behavior in purchasing books online," Computers in Human Behavior, vol. 24(5), pp. 1977-1992, 2008.
Banerjee, "A simple model of herd behaviour," Quarterly Journal of Economics, vol. 107, pp. 797-817, 1992.
[14] A. S. E, "Studies of independence and conformity: I. A minority of one against a unanimous majority," Psychological Monographs: General and Applied, vol. 70(9), p. 1, 1956.


INTELLIGENT QUERY SYSTEM USING NATURAL LANGUAGE PROCESSING
Kshitij Ingole1, Akash Patil2, Kalyani Kshirsagar3, Pratiksha Bothara4, Prof. Vaishali S. Deshmukh5
1,2,3,4 Student, Smt. Kashibai Navale College Of Engineering, Vadgaon(Bk), Pune-41
5 Asst. Professor, Smt. Kashibai Navale College Of Engineering, Vadgaon(Bk), Pune-41

ABSTRACT
We live in a data-driven world, where a large amount of data is generated daily from various sectors. This data is stored in an organized manner in databases, and SQL allows users to access, manage and process the data in the database. SQL is not easy for users who do not have any technical knowledge of databases. The Intelligent Querying System (IQS) acts as an intelligent interface to the database by which a layman, or a person without any technical knowledge of databases, can fire queries in natural language (English). This paper presents a technique for automatically generating SQL queries from natural language. In the proposed system, input is taken in the form of speech and the final output is generated after the query is fired at the database. The process from taking speech input to obtaining the final output is explained in this paper.

Keywords: Databases, Natural Language Processing.

1. INTRODUCTION
Use of databases is widespread. Databases have application in almost all information systems, such as transport information systems, financial information systems, human resource management systems, etc. An intelligent interface that enables efficient interaction between users and databases is the need of database applications. Structured Query Language (SQL) queries get increasingly complicated as the size and the complexity of the relations among entities increase. These complex queries are very difficult to write for a layman or for users who do not have knowledge of them. The main problem is that the users who want to extract information or data from the database do not have knowledge of formal languages like SQL. Users proficient in SQL can access the database easily, but non-technical users cannot. It is essential for the user to know all the details of the database, such as its structure, entities, relations, etc. A natural language interface to a database presents an interface for non-expert users to interact with the system and database without having knowledge of formal database query languages. One of the major and interesting challenges in computer science is to design a model for automatically mapping natural language semantics into programming languages. For example, accessing a database and extracting data from it requires knowledge of Structured Query Language (SQL) and machine-readable instructions that common users have no knowledge of. Ideally, to access a database they should only ask questions in natural language, without knowing either the underlying database schema or any complex machine language. Questions asked by users in natural language form are translated into a statement/query in a formal query language. Once the statement/query is formed, it is processed by the DBMS in order to extract the data required by the user. Databases are common entities that are processed by experts and users with different levels of knowledge. Databases respond only to standard SQL queries, which are based on relational algebra. It is


nearly impossible for a layman to be well versed in SQL querying, as they may be unaware of the structure of the database, namely the tables, their corresponding fields and types, primary keys and so on. There is a need to overcome this gap of knowledge and allow users who have no prior knowledge of SQL to query a database using a query posed in a natural language such as English. Providing a solution to this problem, the proposed system uses natural language speech through voice recognition, converts it to an SQL query and displays the results from the database.

2. MOTIVATION
One of the most important aims of Artificial Intelligence is to make things easily and quickly accessible to humans. Access to information is invaluable and should be available to everyone. Logically formulating the information a person needs is quite easy, and we do it frequently. However, one needs knowledge of formal languages to access information from current systems, and this hinders non-technical people from obtaining the information they want. It is crucial for systems to be user-friendly in order to obtain the highest benefits. These systems try to make information accessible to everyone who knows a natural language. The main motivation of the proposed system is to break the barriers for non-technical users and make information easily accessible to them. Making a user-friendly and more conversationally intelligent system will help users, even naive users, to perform queries without actual knowledge of SQL or the database schema. We aim to introduce a modular system to query a database at any time without the hassle of logically forming the SQL constructs. For instance, consider the scenario of a hospital. Information about the patient is stored in the database. A

doctor may not be well acquainted with databases, so information retrieval becomes difficult for the doctor. The system also acts as a learning tool for students, helping in the assessment of SQL queries and learning through experience. The proposed system takes such problems into consideration, provides a solution to them, and makes access to data easier. With natural language as input and conversion of natural language to SQL queries, even naive users can access the data in the database. Advances in machine learning have progressively increased the reliability, usage and efficiency of voice-to-text models. NLP has also seen major breakthroughs due to the growth of the Internet and business intelligence needs. Many toolkits and libraries exist for the sole purpose of performing NLP, which makes developing such a system easier and achievable.

3. STATE OF ART
For the proposed Intelligent Querying System using Natural Language Processing, various papers have been reviewed; the survey report is given below.

In [1] the author has proposed an interactive natural language query interface for relational databases. Given a natural language query, the system first translates it to an SQL statement and then evaluates it against an RDBMS. To achieve high reliability, the system explains to the user how the query is actually processed. When ambiguities exist, the system generates, for each ambiguity, multiple likely interpretations for the user to choose from, resolving ambiguities interactively with the user.

"The Rule based domain specific semantic analysis Natural Language Interface for Database" [2] converts a wide range of text queries (English questions) into formal (SQL) ones that can then be run against a database by employing


generic and simpler processing techniques and methods. This paper defines the relations involving ambiguous terms and domain-specific rules, and with this approach it makes an NLIDB system portable and generic for small as well as large numbers of applications. The paper focuses on context-based interaction along with the SELECT, FROM, WHERE and JOIN clauses of the SQL query, and also handles complex queries that result from ambiguous natural language queries.

In "Natural Language to SQL Generation for Semantic Knowledge Extraction in Social Web Sources" [3], a system is developed that can execute both DDL and DML queries, input by the user in natural language. A limited data dictionary is used in which all possible words related to a particular system are included. Ambiguity among the words is taken care of while processing the natural language. The system is developed in the Java programming language using various Java tools, and an Oracle database is used to store the information.

The author of [4] proposes a system which provides a convenient as well as reliable means of querying access, hence a realistic potential for bridging the gap between the computer and casual end users. The system employs a CFG-based approach which makes it easy to search the terminals, as the target terminals become separated into many non-terminals. To get maximum performance, the data dictionary of the system has to be regularly updated with words that are specific to the particular system.

The paper "An Algorithm for Solving Natural Language Query Execution Problems on Relational Databases" [5] showed how a modelled algorithm can be used to create a user-friendly, non-expert search process. The modularity of SQL conversion is also shown. The proposed model has been able to intelligently process users' requests in a

reasonable, human-usable format. The limitations of the developed NLIDB are as follows: 1. Domain dependent. 2. Limited query domain.

In "System and Methods for Converting Speech to SQL" [6], the author proposes a system which uses speech recognition models in association with classical rule-based techniques and semantic knowledge of the underlying database to translate the user's spoken query into SQL. To find the joins of tables, the system uses the underlying database schema by converting it into a graph structure. The system has been checked on single and multiple tables, and it gives correct results if the input query is syntactically consistent with the syntactic rules. The system is also database independent, i.e., it can be configured automatically for different databases.

4. PROPOSED WORK
There are many NLIDBs proposed in different papers, but the interaction between the user and the system is missing. The proposed system constructs a natural language interface to databases in which the user can interact with the system, confirm whether the interpretation made by the system is correct, and make any manual changes required. The proposed system tries to build a bridge between linguistics and artificial intelligence, aiming at developing computer programs capable of human-like activity, such as understanding and producing text or speech in a natural language like English, or converting natural language in text or speech form into a language like SQL. The proposed system mainly works in three important steps: 1. speech-to-text conversion, 2. SQL query generation, 3. result generation, as displayed in Fig. 1 (flowchart). In the proposed interactive query system using natural language processing, the very first challenge is to convert the speech to text


format. This phase reduces the human effort of typing the query or text. The result after conversion should not depend on the accent or voice of the user, and the speech-to-text conversion should be precise and produce accurate results each time. As there can be ambiguity in any human speech, interpreting speech properly is the difficult part; hence an edit option is available so that if any change is required in the machine's interpretation of the speech, the user can make it, reducing further misunderstandings. This is done with the help of Google Speech Recognition, which requires an active internet connection to work. There are certain offline recognition systems, such as PocketSphinx, but they have a rigorous installation process that requires several dependencies; Google Speech Recognition is one of the easiest to use. After the conversion of human speech to text, the next challenge is to convert that text into an SQL query. Using an accurate natural language processing algorithm, the text is converted into the SQL query; complex queries like joins must be converted properly. The system analyses and executes an NLQ in a series of steps, and at each stage the data is further processed to finally form a query.
1. Lowercase Conversion: The natural language query is translated into lowercase.
2. Tokenization: The query after lowercase conversion is converted into a stream of tokens, and a token id is provided to each word of the NLQ.
3. Escape Word Removal: The extra/stop words which are not needed in the analysis of the query are removed.
4. Part of Speech Tagger: The tokens are then classified into nouns, pronouns, verbs

and string/integer variables. Consider the following sentence as input: "How old are the students whose first name is Jean?" The filter must return the elements: age, student, first name, Jean. The order of the words is preserved and has its importance during the next steps.
5. Relations-Attributes-Clauses Identifier: The system now classifies the tokens into relations, attributes and clauses on the basis of the tagged elements, and also separates the integer and string values to form clauses.
6. Ambiguity Removal: This step removes the ambiguous attributes that exist in multiple relations with the same attribute name and maps each one to the correct relation.
7. Query Formation: After the relations, attributes and clauses are extracted, the final query is constructed.

After query generation, the generated query is fired at the database and the result is generated: the required data is extracted from the database and displayed. For the input sentence "How old are the students whose first name is Jean?", the query generated is:

SELECT age FROM student WHERE firstname = 'JEAN'

The system architecture is shown in Fig. 1.

Fig. 1. System Architecture

In the proposed Intelligent Query System using natural language processing, the user is expected to give input in the form of speech. After the input is taken in speech format, it is given to the speech-to-text converter and communicator, which converts it into text form. The user can analyze the text and update it manually if required: if the machine misinterprets the speech, the user can correct it, and in this way an interactive system is developed. There are various NLIDBs under development, but the proposed system provides interaction between the user and the machine, which leads to fewer mistakes and misunderstandings. The natural language query is then converted into a stream of tokens with the help of the tokenizer, and a token id is provided to each word of the NLQ. Tokenization is the act of breaking up a sequence of strings into pieces such as words, keywords, phrases, symbols and other elements called tokens. Tokens can be individual words, phrases or even whole sentences. In the process of tokenization, some characters such as punctuation marks are discarded. Then the parse tree is generated through the parser with the help of the token ids, and a set of words is identified. The output of this analysis is a collection of identified words, which the MR Generator turns into a meaningful representation: the identified words are transformed into structures that show how the words relate to each other. Finding the relations between the tokenized words is important for query generation, and that work is done by the MR Generator. The semantic builder takes the output generated by the MR Generator and extracts the relevant attributes from the database. The relations between the word structures and the attributes extracted from the database are identified in the lexicon builder and relation identifier; the word structures and the attributes are mapped by identifying the relations between them, and a semantic map is created. The SQL query is constructed with the semantic map as input to the query generator. This SQL query is then fired on the database, and the output after execution is displayed to the user.

5. CONCLUSIONS AND FUTURE WORK
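As a concrete recap of the query-formation stages described in Section 4 (lowercase, tokenize, remove stop words, map words to schema elements, form the query), a minimal end-to-end sketch follows. The schema, stop-word list and keyword maps below are illustrative assumptions, not the system's actual lexicon or algorithm.

```python
# Hypothetical lexicon for one toy schema (student table).
STOP = {"how", "are", "the", "whose", "is"}          # escape/stop words
SELECT_MAP = {"old": "age"}                          # words selecting an attribute
RELATIONS = {"students": "student", "student": "student"}
ATTRIBUTES = {"first name": "firstname", "age": "age"}

def nlq_to_sql(question):
    # Steps 1-3: lowercase, tokenize, strip punctuation and stop words.
    tokens = [t.strip("?.,'") for t in question.lower().split()]
    content = [t for t in tokens if t not in STOP]
    # Step 5: identify the relation and the selected attribute.
    table = next(RELATIONS[t] for t in content if t in RELATIONS)
    select_attr = next(SELECT_MAP[t] for t in content if t in SELECT_MAP)
    # Crude clause identification: an attribute phrase followed by a literal.
    where = None
    joined = " ".join(content)
    for phrase, column in ATTRIBUTES.items():
        if phrase in joined and column != select_attr:
            value = content[-1].upper()      # last content word as the literal
            where = f"{column} = '{value}'"
    # Step 7: query formation.
    sql = f"SELECT {select_attr} FROM {table}"
    if where:
        sql += f" WHERE {where}"
    return sql

print(nlq_to_sql("How old are the students whose first name is Jean?"))
# SELECT age FROM student WHERE firstname = 'JEAN'
```

The real system replaces these hard-coded maps with a POS tagger, parse tree, MR generator and semantic map, but the data flow is the same.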

Intelligent Query System using Natural Language Processing is a system for making data retrieval from a database easier and more interactive.


The proposed system bridges the gap between the computer and the casual user. Without any technical training, handling databases is not possible for a naive user; this drawback is overcome by this system. The system converts human speech input, i.e., natural language input, into an SQL query; the generated query is given to the database, which returns the desired output. Though the basic idea of the system is not new and many such systems have been developed in the past, this system tries to give more accurate results, and inner joins and aggregate functions are successfully implemented. In the proposed system, natural language queries are processed independently of each other. However, search is often not a single-step process: a user may ask follow-up questions based on the results obtained. It is thus necessary to provide a system that supports a sequence of related queries. In the future, we would like to explore how to support follow-up queries, thereby allowing users to incrementally focus their query on the information they are interested in, especially in conversation-like interactions.

REFERENCES
[1] Fei Li, H. V. Jagadish, "Constructing an interactive natural language interface for relational databases," Proceedings of the VLDB Endowment, vol. 8, issue 01, Sept. 2014.
[2] Probin Anand, Zuber Farooqui, "Rule based Domain Specific Semantic Analysis for Natural Language Interface for Database," International Journal of Computer Applications (0975-8887), vol. 164, no. 11, April 2017.
[3] K. Javubar Sathick, A. Jaya, "Natural Language to SQL Generation for Semantic Knowledge Extraction in Social Web Sources," Middle-East Journal of Scientific Research, 22 (3): 375-384, 2014.
[4] Tanzim Mahmud, K. M. Azharul Hasan, Mahtab Ahmed, "A Rule Based Approach for NLP Based Query Processing," Proceedings of International Conference on Electrical Information and Communication Technology (EICT 2015).
[5] Enikuomehin A. O., Okwufulueze D. O., "An Algorithm for Solving Natural Language Query Execution Problems on Relational Databases," (IJACSA) International Journal of Advanced Computer Science and Applications, vol. 3, no. 10, 2012.
[6] Sachin Kumar, Ashish Kumar, Dr. Pinaki Mitra, Girish Sundaram, "System and Methods for Converting Speech to SQL," International Conference on Emerging Research in Computing, Information, Communication and Applications, ERCICA 2013.


MOOD ENHANCER CHATBOT USING ARTIFICIAL INTELLIGENCE
Divya Khairnar1, Ritesh Patil2, Shrikant Tale3, Shubham Bhavsar4
1,2,3,4 Student, Smt. Kashibai Navale College of Engineering, Pune
[email protected], [email protected], [email protected], [email protected]

ABSTRACT
There are existing researches that attempt psychiatric counseling for users with a chatbot; they lead to changes in drinking habits based on an intervention approach via chatbot. Existing applications do not deal with the user's psychiatric status and mood through easy communication, frequent chat monitoring, and ethical citations in the intervention. In addition, we will use image processing to detect the mood of the user. We recommend a friendly chatbot for counseling that has adapted methodologies to understand counseling contents based on high-level natural language understanding (NLU), and emotion recognition based on a machine learning approach. These methodologies enable continuous, sensitive observation of emotional changes. The pattern matching feature provided helps in communicating with the user via the chatbot.

General Terms: Face Detection, Self Learning, Pattern Matching, Response Generation, Artificial Intelligence, Natural Language Processing, K-nearest neighbor.

1. INTRODUCTION
This project emphasizes providing solutions to the user based on mood recognition through face detection. Response generation by the chatbot is implemented using machine learning concepts. Emotion recognition of humans has long been a research topic, and recently many studies have shown AI methods to be an adequate approach. In our model we have tried to make emotion recognition easier, via image processing. The service will first capture the human's image and recognize the human emotion by studying the image. The chatbot will then suggest videos and other entertainment activities based on the user's mood and chat accordingly. At the end there will be an analysis of the user. This service will be mainly helpful to people who are depressed and are not confident enough to share their feelings with other human beings. It is much easier to share one's feelings with a bot, rather than a human, that will keep one's thoughts safe.

2. MOTIVATION
Anxiety and depression are major issues prevailing in our country. There are about 5.6 million people in India who suffer from depression or anxiety. With the excessive pressure of today's competitive world, fast-growing lives and changing environmental conditions, more and more people are becoming prone to depression. Anxiety is defined as "a feeling of worry, nervousness, or uneasiness". With the addiction to social media and competition, there are more cases of teens committing suicide, because of insecurity, fear of separation, low self-esteem and many more. Mental health is not taken seriously; if not treated at the right time, this may lead to severe depression. Thus people need to understand the importance of mental health care.

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 92

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

3. LITERATURE SURVEY
"Emotion Based Mood Enhancing Music Recommendation", 2017, proposed 'EmoPlayer', an Android application which minimizes effort by suggesting the user a list of songs based on his current emotions [1]. "A Chatbot for Psychiatric Counseling in Mental Healthcare Service Based on Emotional Dialogue Analysis and Sentence Generation", 2017, proposed a system that enables continuous, sensitive observation of emotional changes [2]. "A Novel Approach For Medical Assistance Using Trained Chatbot", 2017, proposed a system that can predict diseases based on symptoms and give the list of available treatments [3]. "A Study On Facial Components Detection Method For Face Based Emotion Recognition", 2014, proposed a facial component detection method for face-based emotion recognition [6]. "The Chatbot Feels You - A Counselling Service Using Emotional Response Generation", 2017, introduced a novel chatbot system for psychiatric counselling service [5].

4. GAP ANALYSIS
The existing system only recommends music as a response on the basis of mood [1]. Machine learning concepts such as self learning and pattern matching are not used in that model; the only feature provided is music recommendation, and the system does not provide a way of communicating with the user. The proposed system will not only recommend music but will also interact with the user. Machine learning concepts such as self learning along with pattern matching will be used, and the system will suggest music as well as motivational videos, jokes, meditation, etc.

ISSN:0975-887

Sr. No. | Paper | Advantages | Disadvantages
1 | A Chatbot for Psychiatric Counseling in Mental Healthcare Service Based on Emotional Dialogue Analysis and Sentence Generation [1] | Free counseling; implementation of morpheme embedding. | RNN cannot track long-term dependencies; a huge amount of training data is required.
2 | The Chatbot Feels You - A Counseling Service Using Emotion Response Generation [2] | Efficient use of pattern matching; RNN performs better with human interaction. | Only uses NLP; storage limitation due to use of RDBMS.
3 | A Novel Approach for Medical Assistance Using Trained Chatbot [3] | Age-based medicine dosage details; easy to use due to JSON docs; cross-platform compatibility. | No real-time monitoring of users; accuracy cannot be guaranteed.
4 | Chatbot Using A Knowledge in Database: Human-to-Machine Conversation Modeling [4] | Implementation of pattern matching; use of AIML. | Use of bigram; storage limitation due to use of RDBMS.
5 | Emotion Based Mood Enhancement [5] | Use of the Haar cascade algorithm. | Requires a lot of sample images, hence more storage required.

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

5. PROPOSED WORK
In this paper, we propose a system, "TALK2ME", which is a desktop-based application. The application detects human emotions from images captured by the application. It gathers random images of various people with different emotions to train our model. By studying these images, the model can precisely classify various human emotions such as happy, sad, angry, depressed, etc. The system responds to the user according to the identified emotion, suggests songs and videos for mood enhancement, keeps track of the user's emotions over time, and generates graphs accordingly.
1. The system detects user emotions using machine learning algorithms such as the Haar cascade [1].
2. Random images are used to train our models. The system makes use of machine learning to learn new things based on past inputs, and uses the k-means clustering algorithm to form clusters of sentences or words that have similar meanings. The following figure depicts the architecture of the proposed system along with its components.

Fig. System Architecture
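The KNN-based pattern matching used for response generation can be sketched as a nearest-neighbour lookup over stored chat patterns. This is an illustrative sketch with made-up patterns and responses, not the system's actual database:

```python
def tokens(text):
    """Lower-case word set of a sentence."""
    return set(text.lower().split())

def jaccard_distance(a, b):
    """1 - Jaccard similarity between two token sets (0 = identical)."""
    union = a | b
    return (1.0 - len(a & b) / len(union)) if union else 1.0

# Tiny illustrative pattern -> response store (hypothetical examples).
PATTERNS = {
    "i feel sad today": "I'm sorry to hear that. Want to watch something uplifting?",
    "i am happy": "That's great! Let's keep the good mood going.",
    "i cannot sleep": "A short guided meditation might help. Shall I play one?",
}

def knn_reply(user_input, k=1):
    """Return the response of the stored pattern nearest to the input
    (k=1 nearest neighbour under Jaccard distance on word sets)."""
    query = tokens(user_input)
    ranked = sorted(PATTERNS,
                    key=lambda p: jaccard_distance(query, tokens(p)))
    return PATTERNS[ranked[0]]

print(knn_reply("feeling sad"))
```

A production system would store many more patterns and could combine this distance with the emotion label detected from the user's image before picking a reply.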


3. The best-fit model is selected for our predictions. The system responds to the user according to the user's emotion. It not only chats with the user but also recommends motivational videos, songs, and other entertainment. If the user is sad, the system recommends motivational videos to enhance the user's mood, or generates a playlist of songs to boost the user's emotions. If the user is depressed, the system lifts the user's spirits by suggesting videos that build confidence. The system provides a pattern-matching feature: user input is matched against existing data in the database and a reply is generated according to the user's requirement. The system uses the KNN algorithm for pattern matching.

6. CONCLUSION AND FUTURE WORK
Integrating chatbots into the employee development and training process can go a long way in boosting the productivity of employees. A human-emotion-recognizing chatbot application is still in its early days, but if used promptly by human resources, it will surely enhance the ever-growing industry of artificial intelligence. An emotion-based chatbot will also help in medical fields if it is deployed with utmost priority given to security concerns.

REFERENCES
[1] P. Belhumeur, J. Hespanha, and D. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 7, pp. 711-720, 1997.
[2] Ashleigh Fratesi, "Automated Real Time Emotion Recognition using Facial Expression Analysis", Master of Computer Science thesis, Carleton University.
[3] M. Mudrová and A. Procházka, "Principal component analysis in image processing", Department of Computing and Control Engineering, Institute of Chemical Technology.


[4] Paul Viola and Michael J. Jones, "Robust real-time object detection", International Journal of Computer Vision, Vol. 57, No. 2, pp. 137-154, 2004.
[5] Sayali Chavan, Ekta Malkan, Dipali Bhatt, Prakash H. Paranjape, "XBeats - An Emotion Based Music Player", International Journal for Advance Research in Engineering and Technology, Vol. 2, pp. 79-84, 2014.
[6] Xuan Zhu, Yuan-Yuan Shi, Hyoung-Gook Kim and Ki-Wan Eom, "An Integrated Music Recommendation System", IEEE Transactions on Consumer Electronics, Vol. 52, No. 3, pp. 917-925, 2006.
[7] Dolly Reney and Dr. Neeta Tripathi, "An Efficient Method to Face and Emotion Detection", Fifth International Conference on Communication Systems and Network Technologies, 2015.


[8] P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, and I. Matthews, "The Extended Cohn-Kanade Dataset (CK+): A complete expression dataset for action unit and emotion-specified expression", Proceedings of the Third International Workshop on CVPR for Human Communicative Behavior Analysis (CVPR4HB 2010), San Francisco, USA, pp. 94-101, 2010.
[9] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias, W. Fellenz, "Emotion Recognition in Human Computer Interaction", IEEE Signal Processing Magazine, Vol. 18, No. 1, pp. 32-80, 2001.
[10] O. Martin, I. Kotsia, B. Macq, I. Pitas, "The eNTERFACE'05 Audio-Visual Emotion Database", 22nd International Conference on Data Engineering Workshops, Atlanta, GA, USA, 2006.


MULTISTAGE CLASSIFICATION OF DIABETIC RETINOPATHY USING CONVOLUTIONAL NEURAL NETWORKS Aarti Kulkarni1, Shivani Sawant2, Simran Rathi3, Prajakta Puranik4 1,2,3,4

Computer Engineering Department , Smt. Kashibai Navale College of Engineering, Pune- 411041 [email protected], [email protected], [email protected], [email protected]

ABSTRACT Diabetic Retinopathy (DR) is a diabetes complication that affects the eye, causing damage to the blood vessels of the retina. Its progressive effect may lead to complete blindness, and it has shown progressive effects on people, especially in India. Screening for the disease involves expensive diagnostic measures, which are meagre. To overcome this situation, this paper proposes a software-based system for ophthalmologists that facilitates the stage-wise classification of Diabetic Retinopathy. A Convolutional Neural Network (CNN) facilitates the stage-based classification of DR by studying retina images known as fundus images; the images are classified by training the network on their features. With the increasing number of Diabetic Retinopathy patients, the need for automated screening tools becomes indispensable. This application will help ophthalmologists to quickly and correctly identify the severity of the disease.

General Terms - deep learning; computer vision
Keywords - diabetic retinopathy; image classification; deep learning; convolutional neural network; transfer learning

1. INTRODUCTION
DR is recognized by the presence of symptoms including micro-aneurysms, hard exudates and haemorrhages. These symptoms have been aggregated into five categories according to the expertise of ophthalmologists, as follows: Stage 1: No apparent retinopathy; Stage 2: Mild Non-Proliferative DR (NPDR); Stage 3: Moderate NPDR; Stage 4: Severe NPDR; Stage 5: Proliferative DR [1][2]. A recent nation-wide cross-sectional study of diabetic patients by the AIOS reiterated the findings of earlier regional studies, which concluded that Diabetic Retinopathy is prevalent in India on a large scale. The existing methods for classification and analysis of DR face certain issues. The rarity of systematic DR screening methods is one of the major causes; acquisition of good-quality retinal images also poses a challenge [3]. These reasons contribute to the difficulties in gradeability of the image and make automation of the DR system necessary. The fundus images obtained via public datasets contain some irregularities which need to be corrected prior to feeding them to the CNN, so the images are pre-processed to obtain normalization throughout the dataset. The CNN is implemented via transfer learning, which facilitates the use of pre-trained models, allowing the network to classify the labelled dataset into the required five classes. This demonstrates the effectiveness of the method for DR image recognition and classification. The use of transfer learning for CNN delivers high-accuracy results within limited time constraints, makes the system robust, and removes the constraints of the quantity and quality of data.


2. MOTIVATION
Among individuals with diabetes, the prevalence of diabetic retinopathy is approximately 28.5 percent in the United States and 18 percent in India [4]. Earlier methods for detecting DR include manual interpretation and repeat examinations. This is time consuming and can delay the prognosis, which may lead to severe complications. Automated grading of diabetic retinopathy has potential benefits such as increasing the efficiency and coverage of screening programs and improving results by providing early detection and treatment.

3. LITERATURE SURVEY
Kanungo et al. proposed a CNN model built around the Inception-v3 architecture. The architecture essentially applies multiple convolution filters to the same input and performs pooling at the same time; all the results are then concatenated. This allows the model to take advantage of multi-level feature extraction from each input, and the problem of overfitting could be reduced [5]. Fitriati et al. proposed an implementation of diabetic retinopathy screening using real-time data. An Extreme Learning Machine is used as the classification method for binary classification of DR stages. The RCSM and DiaretDB0 datasets were used for training and testing. While DiaretDB0 achieved high training accuracy, it failed to perform in testing, and the model performed poorly for both training and testing on RCSM. Introduction of a robust predictive and recognition model like a CNN could improve the performance [6]. Yu et al. proposed a Convolutional Neural Network for exudate detection for diabetic retinopathy. The output of this model is the labeling of a textural feature in the retina called exudates. The performance measures indicated high performance, but the accuracy rate can be improved by increasing the training data size. The well-trained CNN model can also be leveraged for multi-stage classification of DR [7]. Bui et al. proposed a neural network architecture for detection of cotton wool spots in retinal fundus images. Feature extraction can be improved by introducing a convolutional layer, and the accuracy rate can also be improved by training a CNN over a traditional neural network [8]. Padmanabha et al. proposed the implementation of SVM to perform binary classification of DR. Preprocessing techniques like adaptive histogram equalization and segmentation of blood vessels were implemented, enabling extraction of textural features of the entire retinal region. Although it obtained binary classification of DR, multi-stage classification and better feature extraction can be achieved in future [9]. Wang et al. proposed the implementation of transfer learning by comparing three pre-trained Convolutional Neural Network architectures to perform five-stage classification of DR. The image preprocessing technique implemented was noise reduction. Although it obtained multistage classification of DR, increasing the training data size would further improve DR categorization accuracy [10].

4. PROPOSED WORK
Dataset
The Kaggle dataset provides a large set of fundus images taken under a variety of


imaging conditions. Images are labelled with a subject ID as well as their orientation [11]. The images have been labelled on a scale of 0 to 4, where 0 is no DR and 4 is proliferative DR. The dataset consists of 35,126 training images divided into 5 category labels, and 10,715 test images, which are 20 percent of the total test dataset. Since the dataset is a collection of images with different illumination, size and resolution, every image needs to be standardized. Initially all images are resized to standard dimensions. Dataset images are RGB colour images consisting of red, green and blue channels, of which the green channel is used because it gives the best contrast of the blood vessels. This is depicted in Fig. 1.
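The resize-and-green-channel preprocessing described above can be sketched in plain NumPy; the 224x224 target size is an illustrative assumption, and the nearest-neighbour resize is a stand-in for whatever interpolation the real pipeline uses:

```python
import numpy as np

def preprocess_fundus(img, size=(224, 224)):
    """Resize an RGB fundus image (H, W, 3) with nearest-neighbour
    sampling, keep only the green channel (best vessel contrast),
    and normalize pixel values to [0, 1]."""
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]   # source row per target row
    cols = np.arange(size[1]) * w // size[1]   # source col per target col
    resized = img[rows][:, cols]               # nearest-neighbour resize
    green = resized[:, :, 1]                   # channel 1 = green in RGB
    return green.astype(np.float32) / 255.0

# Example on a dummy 600x400 RGB image
dummy = np.random.randint(0, 256, (600, 400, 3), dtype=np.uint8)
out = preprocess_fundus(dummy)
print(out.shape)   # (224, 224)
```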

Fig. 1: Image Pre-processing

Method
Convolutional Neural Networks: A Convolutional Neural Network is a deep-learning-based artificial neural network used to classify images, cluster them by similarity, and perform object recognition. It detects features according to the respective classification of the images. The number of features to be detected directly corresponds to the filters used. The filters are treated as small windows of the required dimensions which are convolved with the matrix of input image pixels. Vertical and horizontal feature detectors are implemented, with the number of channels of the output equal to the number of features. The correspondence of the feature detectors with the required output is very large, which may lead to overfitting. To avoid this, the parameters to be trained on the network can be fixed by computing the dimensions of the filters and the bias, so that they do not depend on the size of the input image. Each layer outputs certain values by convolving the input and the filter, and non-linearity activation functions are applied to the output to achieve the final computations. ReLU is used to compute the non-linearity, which can be implemented by simply thresholding a matrix of activations at zero. These computations take the activations from the previous layer of the network to the activations of the next layer. The convolution layer is paired with a pooling layer, which reduces the size of the representation and speeds up computation, although this layer has nothing to learn.

Fig. 2: Transfer Learning for CNN
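The convolution, ReLU thresholding, and pooling operations described above can be illustrated in plain NumPy. This is a toy single-channel example with a hand-written vertical edge filter, not the actual network used in the paper:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution of a single-channel image with one filter."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0)  # threshold activations at zero

def max_pool(x, size=2):
    """Non-overlapping max pooling; nothing to learn in this layer."""
    h, w = (x.shape[0] // size) * size, (x.shape[1] // size) * size
    x = x[:h, :w].reshape(h // size, size, w // size, size)
    return x.max(axis=(1, 3))

img = np.random.rand(8, 8)
vertical_edge = np.array([[1., 0., -1.],
                          [1., 0., -1.],
                          [1., 0., -1.]])   # a vertical feature detector
feat = max_pool(relu(conv2d(img, vertical_edge)))
print(feat.shape)   # (3, 3)
```

Note how the 8x8 input shrinks to 6x6 after the valid convolution and to 3x3 after 2x2 pooling, matching the size-reduction role of the pooling layer in the text.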


The use of CNN is proposed for two reasons in particular: parameter sharing and sparsity of connections. Parameter sharing is feasible because the number of parameters can be fixed by using feature detectors which can be applied multiple times in various regions of a very large image. Sparsity of connections benefits the network because not all nodes of every layer have to be connected to each other.

Transfer Learning: Transfer learning is implemented in the CNN to lower computational cost and save time. It comprises taking a trained network and using it for another task of a different design. The training of the layers can be modified according to the features that are to be detected. The necessary layers are re-trained by freezing the remaining layers and implementing an alternate softmax layer that maps the output of the network into the desired number of classes. DR diagnosis uses an image dataset consisting of a large number of images; since the number of images available for training is high, many layers can be re-trained. The last few layers are re-trained by adding new hidden units and a softmax layer which gives output in the required five classes corresponding to the stages of the disease. Leveraging the trained units of another network yields better cost and time for large datasets. Fig. 2 depicts the flow of the model including transfer learning in CNN.

Outcome
The classification of DR is done in five stages according to the symptoms associated with the fundus images; every stage is recognized by a set of particular symptoms. Appropriate image preprocessing techniques aid in achieving higher-accuracy classification results. The CNN-based transfer learning methodology results in better performance on the DR classification task, as the pre-trained model accurately classifies the low-level visual patterns in the images. The expected results of this automated system will help in the accurate diagnosis of DR, and the obtained results will enable ophthalmologists to correctly recommend the appropriate treatment. The results of the automated system are obtained within limited time constraints, which is beneficial because manual processes often take a day or two to evaluate the severity of the disease, leading to miscommunication and delayed treatment.

5. CONCLUSION AND FUTURE WORK
The exponential growth of this disease has created an alarming situation, as people were not able to receive timely treatment. Generally, testing the patient and analyzing the report takes a lot of time without any guarantee of accurate results; this system is designed to reduce that problem. Diabetic Retinopathy is classified into five stages corresponding to the symptoms, and the stage-wise classification helps analyze the severity of the disease. The use of deep learning methods has become very popular in all applications due to its self-learning aspect. A transfer-learning-based Convolutional Neural Network reduces the learning time of the system and also guarantees high-accuracy results. It makes the system robust and removes the constraints of the quantity and quality of data.

Future Work
The classification of Diabetic Retinopathy is done using fundus eye images as the input image data. Along with this, OCT images, which are eye images also used for retinal scanning, could be used to identify and classify the disease. The use of these images for the identification and classification of the disease will expand


the scope of the diagnosis. It will also allow the model to learn better during the training phase. Diabetic Retinopathy is one of the diseases that affect people all over the world. The success of such a system for the classification of Diabetic Retinopathy provides the scope for building similar systems for various other diseases that need accurate results within a short period of time. The Convolutional Neural Network is a very powerful network which can be further used for extended analysis of various other diseases.

REFERENCES
[1] C. P. Wilkinson, F. L. Ferris, R. E. Klein, P. P. Lee, C. D. Agardh, M. Davis, D. Dills, A. Kampik, R. Pararajasegaram, J. T. Verdaguer, and the Global Diabetic Retinopathy Project Group, "Proposed international clinical diabetic retinopathy and diabetic macular edema disease severity scales," Ophthalmology, vol. 110, issue 9, Sep. 2003, pp. 1677-1682.
[2] T. Y. Wong, C. M. G. Cheung, M. Larsen, S. Sharma, and R. Simó, "Diabetic retinopathy," Nature Reviews Disease Primers, vol. 2, Mar. 2016, pp. 1-16.
[3] S. S. Gadkari, "Diabetic retinopathy screening: Telemedicine, the way to go!," Indian J Ophthalmol, 2018;66:187-8.
[4] V. Gulshan, L. Peng, M. Coram, et al., "Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs," JAMA, 2016;316(22):2402-2410. doi:10.1001/jama.2016.17216
[5] Y. S. Kanungo, B. Srinivasan and S. Choudhary, "Detecting diabetic retinopathy using deep learning," 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information and Communication Technology (RTEICT), Bangalore, 2017, pp. 801-804. doi: 10.1109/RTEICT.2017.8256708
[6] D. Fitriati and A. Murtako, "Implementation of Diabetic Retinopathy screening using real-time data," 2016 International Conference on Informatics and Computing (ICIC), Mataram, 2016, pp. 198-203. doi: 10.1109/IAC.2016.7905715
[7] S. Yu, D. Xiao and Y. Kanagasingam, "Exudate detection for diabetic retinopathy with convolutional neural networks," 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Seogwipo, 2017, pp. 1744-1747. doi: 10.1109/EMBC.2017.8037180
[8] T. Bui, N. Maneerat and U. Watchareeruetai, "Detection of cotton wool for diabetic retinopathy analysis using neural network," 2017 IEEE 10th International Workshop on Computational Intelligence and Applications (IWCIA), Hiroshima, 2017, pp. 203-206. doi: 10.1109/IWCIA.2017.8203585
[9] A. G. A. Padmanabha, M. A. Appaji, M. Prasad, H. Lu and S. Joshi, "Classification of diabetic retinopathy using textural features in retinal color fundus image," 2017 12th International Conference on Intelligent Systems and Knowledge Engineering (ISKE), Nanjing, 2017, pp. 1-5. doi: 10.1109/ISKE.2017.8258754
[10] X. Wang, Y. Lu, Y. Wang and W. Chen, "Diabetic Retinopathy Stage Classification Using Convolutional Neural Networks," 2018 IEEE International Conference on Information Reuse and Integration (IRI), Salt Lake City, UT, 2018, pp. 465-471. doi: 10.1109/IRI.2018.00074
[11] https://www.kaggle.com/c/diabeticretinopathy-detection/dat.


PREDICTING DELAYS AND CANCELLATION OF COMMERCIAL FLIGHTS USING METEOROLOGICAL AND HISTORIC FLIGHT DATA Kunal Zodape1, Shravan Ramdurg2, Niraj Punde3, Gautam Devdas4, Prof. Pankaj Chandre5, Dr. Purnima Lala Mehta6 1,2,3,4

Student, Department of Computer Engineering, SKNCOE, Savitribai Phule Pune University, Pune, India
5 Asst. Professor, Department of Computer Engineering, SKNCOE, Savitribai Phule Pune University, Pune, India
6 Department of ECE, HMR Institute of Technology and Management, Delhi
[email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT Flight delays are a problem which has reached its pinnacle in recent times. These delays are primarily caused by seasonal upsurges in the number of commuters or by meteorological interference. Airline companies suffer economic costs such as reimbursements and the arrangement of accommodation, along with latent issues like damage to brand value and a depreciated public image. By introducing a predictive model, airline companies can support planning and logistics operations by taking preemptive measures, and commuters can use this information to mitigate the consequences of flight delays. In this paper we propose the use of boosting methods to improve the performance of classifiers by tweaking weak learners in favour of those instances which were misclassified in previous iterations. The model is built using various statistical techniques based on the stochastic distribution of trends in the datasets.

Keywords: Predictive Analysis, Machine Learning, Supervised Learning, Data Mining

1. INTRODUCTION
With the concept of machine learning fueled by the upsurge in the processing power of the underlying hardware, we have been able to apply complex mathematical computations to big data iteratively and automatically in a reasonable time on modern computers. On the other hand, data mining involves data discovery and data sorting among the large data sets available, to identify the required patterns and establish relationships with the aim of solving problems through data analysis [1]. Previous attempts at solving this problem involve techniques such as Artificial Neural Networks, Gaussian processes, and Support Vector Machines [2][3][4]. Previous works also consider factors like airport capacity, airport and airline choice in multi-airport regions, and delay propagation [5][6][7].
We intend to predict flight delays using historic flight and meteorological data as features. Due to flight delays and cancellations, many airline customers suffer complications in their business or travel schedules. Furthermore, airlines have to pay hefty amounts for reimbursements and accommodation charges, and passengers may miss critical business deadlines, which could result in loss of revenue, further damaging quality and reputation [8]. Machine learning algorithms can assist passengers by reducing the inconveniences caused by delays and cancellations, and help the airlines save on reimbursements and


improve their quality by being better prepared for similar anomalies in the future.

2. MOTIVATION
● Every day almost 2.2 million people willingly board commercial airlines, despite the fact that around 850,000 of them will not get to their desired destination on time [9].
● Roughly 40 percent of all air travelers have arrived late consistently for most of the last 35 years [10], and unless things change dramatically, about 40 percent of all air travelers will continue to arrive late every year, perhaps forever.
● A 40 percent failure rate would be unacceptable for the global commercial passenger flight network and acts as a bottleneck for various business and travel related activities, along with air cargo delivery operations.
● Using historic flight data and meteorological data of the source and destination airports as the major attributes, this paper approaches the problem with various machine learning algorithms, in order to gauge the feasibility of different algorithms and choose the most accurate one for prediction.

3. LITERATURE SURVEY
This section provides information about previous work done to address the problem of flight delay prediction.
● Airline Delay Predictions using Supervised Machine Learning - Pranalli Chandra, Prabakaran N. and Kannadasan R., VIT University, Vellore. This paper uses preliminary data analysis techniques and data cleaning to remove noise and inconsistencies. The machine learning techniques used are multiple linear regression and polynomial regression, which allow various metrics of bias and variance in order to pinpoint the best-fitting parameters for the respective models. The k-fold method is used for cross-validation of the intermediate models, and RMSE and Ecart metrics gauge their performance. The implementation is carried out in Python 3.
● A Review on Flight Delay Prediction - Alice Sternberg, Jorge Soares, Diego Carvalho, Eduardo Ogasawara. This paper proposes a taxonomy and consolidates the methodologies used to address the flight delay prediction problem with respect to scope, data, and computing methods, specifically focusing on the increased usage of machine learning methods. It also presents a timeline of significant works that represents the interrelationships between research trends and flight delay prediction problems.
● A Deep Learning Approach to Flight Delay Prediction - Young Jin Kim, Sun Choi, Simon Briceno and Dimitri Mavris. This paper uses deep learning models such as Recurrent Neural Networks along with long short-term memory units. Deep learning is suitable for learning from labelled as well as unlabelled data; it uses multiple hidden layers to improve the learning process, can be accelerated using modern GPUs, and tries to mimic the learning methodology of the biological (mainly human) brain. The paper comments on the effectiveness of various deep learning models for predicting airline delays.
● A Statistical Approach to Predict Flight Delay Using Gradient Boosted Decision Tree - Suvojit Manna, Sanket Biswas, Riyanka Kundu, Somnath Rakshit, Priti Gupta. This paper investigates the effectiveness of the Gradient Boosted Decision Tree algorithm, one of the famous machine learning tools, for analysing air traffic data. The authors built an accurate and robust prediction model which enables an elaborated analysis of the patterns in air traffic delays.
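The core boosting idea from the abstract (re-weighting the instances that previous weak learners misclassified) can be sketched with a minimal AdaBoost-style loop over decision stumps. This is an illustrative toy on made-up data, not the model actually trained in the paper:

```python
import numpy as np

def stump_predict(X, feat, thresh, sign):
    """A decision stump: predict +1/-1 by thresholding one feature."""
    return sign * np.where(X[:, feat] > thresh, 1, -1)

def fit_adaboost(X, y, n_rounds=10):
    """Each round picks the stump with the lowest weighted error, then
    boosts the weights of the examples that stump misclassified."""
    n = len(y)
    w = np.full(n, 1.0 / n)               # start with uniform weights
    ensemble = []
    for _ in range(n_rounds):
        best = None
        for feat in range(X.shape[1]):
            for thresh in np.unique(X[:, feat]):
                for sign in (1, -1):
                    pred = stump_predict(X, feat, thresh, sign)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, feat, thresh, sign)
        err, feat, thresh, sign = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = stump_predict(X, feat, thresh, sign)
        w *= np.exp(-alpha * y * pred)    # up-weight misclassified rows
        w /= w.sum()
        ensemble.append((alpha, feat, thresh, sign))
    return ensemble

def predict(ensemble, X):
    score = sum(a * stump_predict(X, f, t, s) for a, f, t, s in ensemble)
    return np.sign(score)

# Toy data: "delayed" (+1) when a departure-hour feature is high
X = np.array([[6.], [7.], [8.], [17.], [18.], [19.]])
y = np.array([-1, -1, -1, 1, 1, 1])
model = fit_adaboost(X, y, n_rounds=5)
print(predict(model, X))   # ideally recovers y
```

Gradient boosting (as in the surveyed GBDT paper) follows the same additive-ensemble principle but fits each new learner to the residual errors rather than to re-weighted instances.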


4. GAP ANALYSIS
This section provides a comparison of the previously published papers addressing the flight delay prediction problem.

Table 1: GAP ANALYSIS

Sr. No. | Paper Title | Year | Algorithms Used | Results Obtained
1 | Airline Delay Predictions using Supervised Machine Learning | 2018 | Linear Regression | Flight delay prediction analysis
2 | A Review on Flight Delay Prediction | 2017 | KNN, Fuzzy Logic, Random Forest | Taxonomy and summarized initiatives to address the flight delay prediction problem
3 | A Deep Learning Approach to Flight Delay Prediction | 2016 | Recurrent Neural Networks | Improved accuracy in flight delay prediction
4 | A Statistical Approach to Predict Flight Delay using Gradient Boosted Decision Tree | 2017 | Gradient boosted decision trees | Prediction model enabling an elaborated analysis of patterns in air traffic delays

5. PROPOSED WORK
The proposed predictive model initially undergoes the following data preprocessing steps:
● Filling in missing values
● Alternative values for crucial cells
● Merging climatological data with flight data
● Dimensionality reduction
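The first step, filling missing values with a sentinel identifier, might be sketched in pandas as follows; the column names mirror the flight dataset attributes, but the values are made up.

```python
import pandas as pd

# Toy flight records with a missing DEPARTURE_TIME (hypothetical values).
flights = pd.DataFrame({
    "FLIGHT_NUMBER": [101, 102, 103],
    "DEPARTURE_TIME": ["0905", None, "1430"],
})

# Fill missing values with a unique sentinel identifier.
flights["DEPARTURE_TIME"] = flights["DEPARTURE_TIME"].fillna("MISSING")
print(flights)
```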

Data preprocessing is followed by designing the prediction engine and building the learning model using different boosting techniques to produce learned parameters, which are ultimately used for prediction.

Datasets and Sources
The U.S. Department of Transportation's Bureau of Transportation Statistics tracks the on-time performance of domestic flights operated by large air carriers.

ISSN:0975-887

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 103

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

Table 2: Flight Dataset Attributes

Sr. no. | Flight dataset attribute | Sample Data Values / Description
1 | AIRPORT | ABQ, BLI, DHN
2 | CITY | Albuquerque, Waco, Nantucket
3 | STATE | NM, TX, PA
4 | COUNTRY | USA
5 | DATE | dd-mm-yyyy format
6 | FLIGHT_NUMBER | Flight identifier
7 | ORIGIN_AIRPORT | Starting airport
8 | COUNTRY | USA
9 | DESTINATION_AIRPORT | Planned destination
10 | SCHEDULED_DEPARTURE | Planned departure time
11 | DEPARTURE_TIME | Actual departure time

Local Climatological Data (LCD) consists of hourly, daily, and monthly summaries for approximately 1,600 U.S. locations, provided in the public domain by the US National Oceanic and Atmospheric Administration.

Table 3: Meteorological Dataset Attributes

Sr. no. | Meteorological dataset attribute | Sample Data Values / Description
1 | DATE | Given as serial numbers
2 | TEMPERATURE - MIN, AVG, MAX | Min, avg, max temperature in Fahrenheit
3 | SUN - RISE (UTC), SET (UTC) | Given in ISO 8601 format without seconds
4 | WEATHER | Alphanumerical weather type identifiers
5 | PRECIPITATION - RAIN, SNOW | Given in inches for snow and rainfall
6 | PRESSURE | Given in inches of mercury (inHg)
7 | WIND SPEED - LOW, HIGH | Low and high wind speed in km/hr
8 | WIND DIRECTION - LOW, HIGH | Low and high wind direction in degrees

5.2 System Overview
This section provides the architectural overview of the proposed system, highlighting the processing workflow.

Fig. 1: System Overview

5.3 Data Preparation and Preprocessing
This section lists some heuristics for preparing and preprocessing the data before building the learning model.

5.3.1 Filling in Missing Values
This step deals with missing values in the dataset by filling them with a unique identifier. For example, if DEPARTURE_TIME is absent, the empty cell is filled with a unique identifier.

5.3.2 Data Discretization/Binning
Attributes with a continuous distribution of values can be classified using data discretization, or binning, which creates discrete classes of ordinal values and provides an optimised environment for the learning process.
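Such discretization can be sketched with the pandas cut function; the delay values, bin edges and class labels here are illustrative, not the paper's actual ones.

```python
import pandas as pd

# Continuous departure delays in minutes (made-up values).
delays = pd.Series([-3, 5, 12, 45, 95, 180])

# Discretize into ordinal classes, e.g. on-time / short / medium / long delay.
bins = [-float("inf"), 0, 15, 60, float("inf")]
labels = ["on_time", "short", "medium", "long"]
delay_class = pd.cut(delays, bins=bins, labels=labels)
print(delay_class.tolist())
```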

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 105

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

5.3.3 Merging Meteorological Data with Flight Data
Meteorological data from the National Oceanic and Atmospheric Administration is merged with historical flight data procured from the Bureau of Transportation Statistics.

5.3.4 Dimensionality Reduction
Dimensionality reduction reduces the complexity of the dataset by merging correlated attributes into more generalized attributes, which facilitates faster computation during the learning process. Due to the lower dimensionality of the resultant dataset, data visualization and analysis also become more concise.
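A minimal sketch of both steps, assuming hypothetical column names shared by the two tables; the join keys and attribute values are invented for illustration.

```python
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical flight and weather tables keyed on DATE and AIRPORT.
flights = pd.DataFrame({
    "DATE": ["01-01-2018", "01-01-2018", "02-01-2018"],
    "AIRPORT": ["ABQ", "BLI", "ABQ"],
    "DEP_DELAY": [5, 40, 0],
})
weather = pd.DataFrame({
    "DATE": ["01-01-2018", "01-01-2018", "02-01-2018"],
    "AIRPORT": ["ABQ", "BLI", "ABQ"],
    "TEMP_AVG": [55.0, 48.0, 60.0],
    "WIND_HIGH": [20.0, 35.0, 10.0],
})

# Merge climatological data with flight data on the shared keys.
merged = flights.merge(weather, on=["DATE", "AIRPORT"], how="left")

# Reduce the correlated numeric attributes to a single component.
pca = PCA(n_components=1)
components = pca.fit_transform(merged[["TEMP_AVG", "WIND_HIGH"]])
print(merged.shape, components.shape)
```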

5.4 Model Building

5.4.1 Boosting Ensemble Method

Fig. 2: Boosting

Boosting methods combine weak classifiers to improve performance through additive learning, which learns by successively improving the previously built models. The core idea is to build a model from the training data and then create a second model that corrects the errors of the first.

5.4.2 AdaBoost Method

Fig. 3: AdaBoosting

AdaBoost (Adaptive Boosting) was developed for efficiently boosting binary classifiers. AdaBoost is adaptive because subsequent weak learners are tweaked in favor of the instances misclassified by previous ones [11]. AdaBoost is sensitive to noisy data and outliers, so a high-quality training set is required to counteract this. The weak learner most commonly used with AdaBoost is the one-level decision tree, also known as a decision stump. Weak learners are added until no further improvement can be made or until a threshold number of weak learners is reached [12]. AdaBoost puts more weight on the instances that are difficult to classify than on those that are easily classified, and it is less susceptible to over-fitting the training data. A strong classifier is built by combining the individual weak learners.
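A hedged sketch of AdaBoost on synthetic data; scikit-learn's default weak learner for AdaBoostClassifier is already a depth-1 decision tree (a stump), matching the description above. The dataset is a made-up stand-in for delayed/not-delayed labels.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic binary task standing in for delayed / not-delayed labels.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Default weak learner is a decision stump; 50 stumps are combined additively.
model = AdaBoostClassifier(n_estimators=50, random_state=0)
model.fit(X, y)
print(f"training accuracy: {model.score(X, y):.3f}")
```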

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 106

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

5.4.3 Gradient Boosting
Gradient boosting is a highly popular technique for building predictive models. It can be expressed as an optimization problem that minimises a loss function by combining multiple weak submodels using a procedure similar to gradient descent. This method overcomes the limitations of AdaBoost by expanding the scope of boosting to regression and multiclass classification. Gradient boosting involves three elements:

Loss Function: A differentiable metric that is minimised in order to fine-tune the model. It can be computed in a variety of ways, such as squared error (RMSE), Ecart, or logarithmic loss.

Weak Learner: Decision trees are used as weak learners in gradient boosting. The trees can be constrained in multiple ways, such as limiting the depth or the number of branches. Whereas AdaBoost's decision trees usually comprise a single split (decision stumps), gradient boosting trees can have 4 to 8 levels.

Additive Model: The weak classifiers are combined by using the weights of the submodels as the parameters in a gradient descent procedure that minimises the loss function.

5.4.4 Stochastic Gradient Boosting
In this method, the base learner is fit, at each iteration, on a subsample of the training set drawn at random without replacement. This yields a significant improvement in the performance of models built using gradient boosting [12]. This variant also proves beneficial because it reduces correlations between the submodels by greedily selecting the most informative trees in a stochastic manner.

6. SYSTEM COMPARISON

There have been several attempts to apply various supervised and unsupervised machine learning algorithms to predict delays and cancellations in commercial airlines. For instance, in A Statistical Approach to Predict Flight Delay Using Gradient Boosted Decision Tree [13], the algorithm used was gradient boosting. The comparison between AdaBoost, gradient boosting and stochastic gradient boosting is as follows.

AdaBoost and gradient boosting differ in how they create the weak learners during the iterative process. Adaptive boosting changes the sample distribution at each iteration by modifying the weights attached to each instance. It favours the misclassified data points by increasing their weights and correspondingly decreases the weights of the correctly classified data points, so each weak learner is trained to classify the more difficult instances. After training, the weak learner is added to the strong one according to its performance: the higher its performance, the more it contributes to the strong learner.

Gradient boosting, on the contrary, does not modify the sample distribution. Instead of training on a new sample distribution, the weak learner trains on the remaining errors of the strong learner: in each iteration, the mismatched data points are calculated and a weak learner is fitted to these residuals of the strong learner.

In stochastic gradient boosting, at each iteration a subsample of the training data is drawn at random (without replacement) from the full training dataset. The randomly selected subsample, instead of the full sample, is then used to fit the base learner. A few variants can be used, such as subsampling rows before creating each tree, subsampling columns before creating each tree, or subsampling columns before considering each split. Thus, the proposed algorithm, stochastic gradient boosting, exhibits

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 107

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

better performance than the previously implemented Adaboost and Gradient Boosting.
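The three boosting variants compared above can be contrasted empirically; in scikit-learn, stochastic gradient boosting corresponds to setting subsample below 1.0 on GradientBoostingClassifier. The data here is synthetic and the scores are only illustrative, not the paper's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=10, random_state=0)

models = {
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    # subsample < 1.0 draws a random fraction of rows for each tree,
    # which is Friedman's stochastic gradient boosting [12].
    "Stochastic GB": GradientBoostingClassifier(subsample=0.5, random_state=0),
}

results = {name: cross_val_score(m, X, y, cv=5).mean()
           for name, m in models.items()}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```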

7. CONCLUSION
In a generalized manner, this paper has shown that prediction of delays in commercial flights is tractable and that local weather data at the origin airport is indeed essential for the prediction of delays. In the case of flight delays or cancellations, the most significant real-world factors are a combination of technical and logistical issues. The datasets considered in this paper do not provide this aspect of the data, so the accuracy of the model is restrained by this limitation.

REFERENCES
[1] Belcastro, Loris, et al. "Using Scalable Data Mining for Predicting Flight Delays." ACM Transactions on Intelligent Systems and Technology (TIST) 8.1 (2016).
[2] Khanmohammadi, Sina, Salih Tutun, and Yunus Kucuk. "A New Multilevel Input Layer Artificial Neural Network for Predicting Flight Delays at JFK Airport." Procedia Computer Science 95 (2016): 237-244.
[3] Hensman, James, Nicolo Fusi, and Neil D. Lawrence. "Gaussian processes for big data." CoRR, arXiv:1309.6835 (2013).
[4] Bandyopadhyay, Raj, and Guerrero, Rafael. "Predicting airline delays." CS229 Final Projects (2012).
[5] Gilbo, Eugene P. "Airport capacity: Representation, estimation, optimization." IEEE Transactions on Control Systems Technology 1.3 (1993): 144-154.
[6] Tierney, Sean, and Michael Kuby. "Airline and airport choice by passengers in multi-airport regions: The effect of Southwest Airlines." The Professional Geographer 60.1 (2008): 15-32.
[7] Schaefer, Lisa, and David Millner. "Flight delay propagation analysis with the detailed policy assessment tool." Systems, Man, and Cybernetics, 2001 IEEE International Conference on, Vol. 2. IEEE, 2001.
[8] Guy, Ann Brody. "Flight delays cost $32.9 billion." http://news.berkeley.edu/2010/10/18/flight_delays.
[9] "Airlines' 40% Failure Rate: 850,000 Passengers Will Arrive Late Today -- And Every Day." https://www.forbes.com/sites/danielreed/2015/07/06/airlines-40-failure-rate-850000-passengers-will-arrive-late-today-and-every-day/#2d077c1074bd
[10] Hansen, Mark, and Chieh Hsiao. "Going south?: Econometric analysis of US airline flight delays from 2000 to 2004." Transportation Research Record: Journal of the Transportation Research Board 1915 (2005): 85-94.
[11] Schapire, Robert E. "Explaining AdaBoost." Princeton University, Dept. of Computer Science, Princeton, NJ, USA.
[12] Friedman, Jerome H. "Stochastic gradient boosting." Department of Statistics and Stanford Linear Accelerator Center, Stanford University, Stanford, CA, USA.
[13] Manna, Suvojit, Sanket Biswas, et al. "A Statistical approach to predict Flight Delay using Gradient Boosted Decision Tree."

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 108

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

A SURVEY ON RISK ASSESSMENT IN HEART ATTACK USING MACHINE LEARNING

Rahul Satpute1, Atharva Dhamrmadhikari2, Irfan Husssain3, Prof. Piyush Sonewar4
1,2,3 Student, Department of Computer Engineering, Smt. Kashibai Navle College of Engineering, Pune, India
4 Asst. Professor, Department of Computer Engineering, Smt. Kashibai Navle College of Engineering, Pune, India
[email protected], [email protected], [email protected]

ABSTRACT
Acute Myocardial Infarction (heart attack), a Cardiovascular Disease (CVD) arising from Ischemic Heart Disease (IHD), is one of the major killers around the world. A proficient approach is proposed in this work that can predict the chances of heart attack when a person is experiencing chest pain or equivalent symptoms. We will develop a prototype by integrating clinical data collected from patients admitted to different hospitals with Acute Myocardial Infarction (AMI). 25 attributes related to the symptoms of heart attack are collected and analyzed, where chest pain, palpitation, breathlessness, and syncope with nausea, sweating and vomiting are the prominent symptoms of a person having a heart attack. The data mining technique of naïve Bayes classification is used to analyze heart attack based on the training dataset. This technique will increase the accuracy of the classification result of heart attack prediction. A guiding system that assesses whether chest pain indicates a heart attack may help the many people who tend to neglect chest pain and later land in the catastrophe of a heart attack; this is an interesting research area still at an early stage.
Keywords: Acute Myocardial Infarction (Heart Attack), Cardiovascular Disease (CVD), Ischemic Heart Disease (IHD), Naïve Bayes Classification.

1. INTRODUCTION
Acute myocardial infarction, commonly referred to as heart attack, is the most common cause of sudden deaths in city and village areas. Detecting a heart attack on time is of paramount importance, as delay in prediction may lead to severe damage to the heart muscle, called the myocardium, leading to morbidities and mortalities. Even after having severe and unbearable chest pain, a person may neglect to go to a doctor for several reasons, including professional reasons, personal reasons, or simple overconfidence that they cannot have a heart attack. Many times, people do not realize that the chest pain they are suffering from may be a heart attack, and it leads to death because they are not educated on the subject.
Since the mobile phone is one of the most widely used technologies nowadays, developing an application to predict an episode of heart attack will yield productive results in diagnosing or excluding heart attack as the cause of the chest pain someone is suffering from. This will lead to early prediction of heart attack, early presentation to and evaluation by a doctor, and early treatment. Chest pain is the most common and significant symptom of a heart attack, although some other features also indicate a heart attack.
In this era, modern medical science has been enriched with many modern technologies and biological equipment that greatly reduce the overall mortality rate. But cardiovascular disease (CVD), cancer, chronic respiratory disease and diabetes are becoming fatal at an alarming rate. Acute myocardial infarction occurs when there is a sudden, complete blockage of a coronary artery that supplies blood to an area of the


heart, also known as heart attack. A blockage can develop due to a buildup of plaque, a substance mostly made of fat, cholesterol and cellular waste products. Due to an insufficient blood supply, some of the heart muscle begins to die; without early medical treatment this damage can be permanent. The medical sector is rich with information, but the major issues with medical data mining are the volume and complexity of the data, poor mathematical categorization and lack of canonical form. We have used advanced data mining techniques to discover knowledge from the collected medical datasets.
Reducing the delay between the onset of a heart attack and seeking treatment is a major issue. Individuals who are busy in their homes or offices with their regular work, and rural people with no knowledge of the symptoms of heart attack, may neglect chest discomfort. They may not intend to neglect it, but they may pass the time and decide to go to a doctor or hospital only after a while. For heart attack, however, time matters most. There are many Mobile Health (mHealth) tools available to the consumer for the prevention of CVD, such as self-monitoring mobile apps. Current science shows evidence on the use of a vast array of mobile devices, such as mobile phones for communication and feedback, and smartphone apps. As medical diagnosis of heart attack is important but complicated and costly, we propose a system for medical diagnosis that would enhance medical care and reduce cost. Our aim is to provide a ubiquitous service that is both feasible and sustainable, and which also lets people assess their risk of heart attack at that point of time or later.

Problem Statement
Reliable identification and classification of cardiovascular diseases requires pathological tests, namely blood tests and ECG, and analysis by experienced pathologists. As this involves human judgment of several factors and a combination of experiences, a decision support system is desirable in this case. The proposed problem statement is "Risk Assessment in Heart Attack using Machine Learning".

Motivation
Acute myocardial infarction, commonly referred to as heart attack, is the most common cause of sudden deaths in city and village areas. It is one of the most dangerous diseases among men and women, and early identification and treatment is the best available option for people.

2. RELATED WORK
K-Nearest Neighbor (KNN) is a very simple, popular, highly efficient and effective technique for pattern recognition. KNN is a straightforward classifier, where samples are classified based on the class of their nearest neighbors. Medical databases are large in volume, and if the dataset contains excessive and irrelevant attributes, classification may produce less accurate results. Heart disease is the leading cause of death in India; in Andhra Pradesh, heart disease was the leading cause of mortality, accounting for 32% of all deaths, a rate as high as Canada (35%) and the USA. Hence there is a need for a decision support system that helps clinicians take precautionary steps. One work proposed a new technique which combines KNN with a genetic technique for effective classification; genetic techniques perform a global search in complex, large and multimodal landscapes and provide optimal solutions [1].
Another work focuses on a new approach for applying association rules in the medical domain to discover heart disease prediction rules. The health care industry collects a huge amount of health care data which, unfortunately, is not mined to discover hidden information for effective decision making. Discovery of hidden


patterns and relationships often goes unexploited. Data mining techniques can help remedy this situation. Data mining has found numerous applications in business and scientific domains; association rules, classification and clustering are major areas of interest in data mining [2].
Another work analyzed prediction systems for heart disease using a larger number of input attributes. The work uses medical terms such as sex, blood pressure and cholesterol (13 attributes in all) to predict the likelihood of a patient getting heart disease. Until then, 13 attributes had been used for prediction; this research added two more attributes, obesity and smoking. The data mining classification algorithms Decision Trees, Naive Bayes, and Neural Networks are analyzed on a heart disease database [3].
Medical diagnosis systems play an important role in medical practice and are used by medical practitioners for diagnosis and treatment. In one such work, a medical diagnosis system is defined for predicting the risk of cardiovascular disease. The system is built by combining the relative advantages of a genetic technique and a neural network. Multilayered feed-forward neural networks are particularly suited to complex classification problems, and the weights of the neural network are determined using a genetic technique because it finds an acceptably good set of weights in a small number of iterations [4].
A wide range of heart conditions is identified by thorough examination of the features of the ECG report. Automatic extraction of time-plane features is valuable for the identification of vital cardiac diseases. One work presents a multiresolution wavelet transform based system for detecting the 'P', 'Q', 'R', 'S', 'T' peak complex in the original ECG signal. The 'R-R' interval is an important feature of the ECG signal that corresponds to the heartbeat of the person. An abrupt

increase in the height of the 'R' wave, or changes in the measurement of the 'R-R' interval, denote various anomalies of the human heart. Similarly, the 'P-P', 'Q-Q', 'S-S' and 'T-T' intervals also correspond to various anomalies of the heart, and their peak amplitudes indicate other cardiac diseases. In the proposed method the 'PQRST' peaks are marked and stored over the entire signal, and the time interval between two consecutive 'R' peaks, as well as the other peak intervals, are measured to find anomalies in the behavior of the heart, if any [5].
The ECG signal is well known for its nonlinear changing behavior, a key characteristic utilized in this research: the nonlinear component of its dynamics changes more markedly between normal and abnormal conditions than the linear one does. As higher-order statistics (HOS) preserve phase information, this work makes use of one-dimensional slices from the higher-order spectral region of normal and ischemic subjects. A feed-forward multilayer neural network (NN) with error back-propagation (BP) learning was used as an automated ECG classifier to explore the possibility of recognizing ischemic heart disease from normal ECG signals [6].
Automatic ECG classification is a promising tool for cardiologists in medical diagnosis for effective treatment. One work proposes efficient techniques to automatically classify ECG signals into normal and arrhythmia-affected (abnormal) parts. Morphological features are extracted to characterize the ECG signal, and a probabilistic neural network (PNN) is used to capture the distribution of the feature vectors for classification, after which the performance is evaluated. The ECG time series signals in this work are taken from the MIT-BIH arrhythmia database [7].
Heart diseases are the most extensive cause of human death. Every


year, 7.4 million deaths are attributed to heart diseases (including cardiac arrhythmia), with 52% of these deaths due to strokes and 47% due to coronary heart diseases. Hence, identification of different heart diseases at the primary stage becomes very important for the prevention of cardiac-related deaths. Existing conventional ECG analysis methods, such as the R-R interval and the wavelet transform, combined with classification algorithms such as Support Vector Machine, K-Nearest Neighbor and the Levenberg-Marquardt Neural Network, are used for the detection of cardiac arrhythmia. Using these techniques a large number of features are extracted, but they do not identify the problem exactly [8].

3. PROPOSED SYSTEM
We propose a novel heart attack prediction mechanism which first learns deep features and then trains a classifier on these learned features. Experimental results show that the classifier outperforms all other classifiers when trained with all attributes and the same training samples; it is also demonstrated that the performance improvement is statistically significant. Prediction of heart attack using a low-population, high-dimensional dataset is challenging due to insufficient samples for learning an accurate mapping between features and class labels. The current literature usually handles this task through handcrafted feature creation and selection. Naïve Bayes is found to identify the underlying structure of the data better than other techniques.

Proposed System Architecture

Fig: Proposed System Architecture

4. MATHEMATICAL MODEL
Mathematical equation in Naive Bayes classification: Bayes' theorem gives us a method to calculate a conditional probability, i.e., the probability of an event based on previous knowledge of related events. Here we use this technique for heart disease prediction, i.e., classification based on conditional probability. More formally, Bayes' theorem is stated as the following equation:

P(A/B) = P(B/A) * P(A) / P(B)

The components of the above statement are:
P(A/B): The conditional probability of occurrence of event A given that event B is true, i.e. the probability of heart disease given the heart check-up attributes.
P(A) and P(B): The probabilities of occurrence of events A and B respectively, i.e. the prior probabilities of the disease and of the heart check-up attributes.
P(B/A): The probability of occurrence of event B given that event A is true, i.e. the probability of the heart check-up attributes given the disease, which is used to predict the actual heart disease.
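As an illustration, Bayes' theorem with the naive independence assumption can be applied to check-up-style data using scikit-learn's GaussianNB; all attribute names and values below are hypothetical, not the paper's clinical dataset.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Tiny made-up training set: rows are patients, columns are hypothetical
# check-up attributes (e.g. age, blood pressure, cholesterol); 1 = disease.
X = np.array([[63, 145, 233],
              [37, 130, 250],
              [56, 120, 236],
              [57, 140, 192],
              [67, 160, 286],
              [41, 135, 203]])
y = np.array([1, 0, 0, 0, 1, 0])

# Gaussian Naive Bayes applies Bayes' theorem with the "naive"
# conditional-independence assumption over the attributes.
clf = GaussianNB().fit(X, y)
probs = clf.predict_proba([[60, 150, 240]])
print(probs)
```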


5. ALGORITHM
The Naive Bayes algorithm learns the probability that an object with certain features belongs to a particular group or class; in short, it is a probabilistic classifier. It is called "naive" because it assumes that the occurrence of a certain feature is independent of the occurrence of the other features. Here we classify heart disease based on heart check-up attributes. Naive Bayes, or Bayes' rule, is the basis for many machine learning and data mining methods; the rule is used to create models with predictive capabilities and provides new ways of exploring and understanding data. A Naive Bayes implementation is preferred: 1) when the volume of data is high; 2) when the attributes are independent of each other; 3) when more efficient output is expected compared to other methods. Based on this information and these steps, we classify and predict heart disease depending on the heart check-up attributes.

6. CONCLUSION
In this work we have presented a novel approach for classifying heart disease. To validate the proposed method, we will add patients' heart test result details to predict the type of heart disease using machine learning. Training datasets are taken from the UCI repository. Our approach uses the naïve Bayes technique, which is a competitive method for classification. This prediction model helps doctors in an efficient heart disease diagnosis process with fewer attributes. Heart disease is the most common contributor to mortality in India and in Andhra Pradesh. Identification of major risk factors, development of decision support systems, and effective control measures and health

education programs will lead to a decline in heart disease mortality.

REFERENCES
[1] M. Akhil Jabbar, B. L. Deekshatulu, Priti Chandra. "Classification of Heart Disease Using K-Nearest Neighbor and Genetic Algorithm." International Conference on Computational Intelligence: Modeling Techniques and Applications (CIMTA), 2013.
[2] M. A. Jabbar, B. L. Deekshatulu, Priti Chandra. "An evolutionary algorithm for heart disease prediction." CCIS, pp. 378-389, Springer (2012).
[3] Chaitrali S. Dangare. "Improved Study of Heart Disease Prediction System Using Data Mining Classification Techniques." International Journal of Computer Applications, Vol. 47, No. 10 (June 2012).
[4] Amma, N.G.B. "Cardio Vascular Disease Prediction System using Genetic Algorithm." IEEE International Conference on Computing, Communication and Applications, 2012.
[5] Sayantan Mukhopadhyay, Shouvik Biswas, Anamitra Bardhan Roy, Nilanjan Dey. "Wavelet Based QRS Complex Detection of ECG Signal." International Journal of Engineering Research and Applications (IJERA), Vol. 2, Issue 3, May-Jun 2012, pp. 2361-2365.
[6] Sahar H. El-Khafif and Mohamed A. El-Brawany. "Artificial Neural Network-Based Automated ECG Signal Classifier." 29 May 2013.
[7] M. Vijayavanan, V. Rathikarani, P. Dhanalakshmi. "Automatic Classification of ECG Signal for Heart Disease Diagnosis using morphological features." ISSN: 2229-3345, Vol. 5, No. 04, Apr 2014.
[8] I. S. Siva Rao, T. Srinivasa Rao. "Performance Identification of Different Heart Diseases Based On Neural Network Classification." ISSN 0973-4562, Volume 11, Number 6 (2016), pp. 3859-3864.
[9] J. R. Quinlan. "Induction of decision trees." Machine Learning, vol. 1, no. 1, pp. 81-106, 1986.
[10] J. Han, J. Pei, and M. Kamber. Data Mining: Concepts and Techniques. Elsevier, 2011.
[11] I. H. Witten, E. Frank, M. A. Hall, and C. J. Pal. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, 2016.
[12] L. Breiman. "Random forests." Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
[13] Mullasari AS, Balaji P, Khando T. "Managing complications in acute myocardial infarction." J Assoc Physicians India. 2011 Dec; 59 Suppl(1): 43-8.


[14] C. Alexander and L. Wang. "Big data analytics in heart attack prediction." J Nurs Care, vol. 6, no. 393, 2017.


[15] Wallis JW. "Use of artificial intelligence in cardiac imaging." J Nucl Med. 2001 Aug; 42(8): 1192-4.


Textual Content Moderation using Supervised Machine Learning Approach

Revati Ganorkar1, Gaurang Suki2, Shubham Deshpande3, Mayur Giri4, Araddhana Deshmukh5
1,2,3,4 Student, Department of Computer Science, Smt. Kashibai Navale College of Engineering, Pune
5 Aarhus University, Herning, Denmark
[email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
With the increasing use of Social Networking Sites, a huge amount of data is generated on a daily basis. This data contains a plethora of hate speech and offensive content which makes a negative impact on society. Various tech giants such as Facebook[1] and Microsoft have been using manual content moderation techniques on their websites, but even this has a negative effect on the content moderators reviewing content across the world. In order to tackle this issue, we have proposed an efficient automated textual content moderation technique which uses a supervised machine learning approach.
KEYWORDS: Social Networking Sites, content moderation, hate speech, offensive words, text classification

1. INTRODUCTION
Social Networking Sites have gained a considerable amount of popularity in recent years. They have totally changed people's way of communicating and sharing information. People use different means of communication (for example text messages, images, audio clips, video clips, etc.). The information shared on social networking sites may contain some data which might be offensive to some people. The shared media may also contain illegal information which can spread the wrong message in society.
In [4], the authors observed that the increasing use of social media and Web 2.0 is daily drawing more people to participate and express their points of view about a variety of subjects. However, there are a huge number of comments which are offensive and sometimes not politically correct, and so must be hindered from coming up online. This is pushing the service providers to be more careful with the contents they publish, to avoid judicial claims. This work proposes the use of automatic textual classification techniques to identify harmful posts and to only allow harmless textual posts and other content to go online.
Different sites use different methods to moderate textual content. SNS like Facebook[1] and Twitter[2] manually moderate the content, whereas LinkedIn[3] automatically removes content after it is reported by a certain number of users. But manual moderation of content requires manpower, and the moderators go through a lot of mental stress while moderating the data. Some cases where moderators suffered from extreme stress are discussed here. In [5], content moderators alleged that Facebook[1] failed to keep its moderators safe, as they developed post-traumatic stress and psychological trauma from viewing graphic images and videos. In another incident [6], two employees at Microsoft filed a lawsuit against Microsoft as they were forced to view content that

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 115

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

was inhumane, which led to severe post-traumatic stress disorder. Thus, manual moderation of abusive content is harmful to the person moderating it. Therefore there is a need for an efficient technique to monitor hate speech and offensive words on social networking sites.

2. LITERATURE SURVEY
In [7], the authors address moderation of multimodal content such as images and text. They develop a deep learning classifier that jointly models textual and visual characteristics of pro-eating-disorder content that violates community guidelines. For analysis, they used a million photos and posts from Tumblr. The classifier discovers deviant content efficiently while maintaining high recall (85%). They also discuss how automation might impact community moderation, and the ethical and social obligations of this area.

In [8], the proposed system is designed for the Windows or Linux operating systems. The implementation of this system is based on a PHP framework. A MySQL database is used for storing the datasets, by configuring a LAMP server on Ubuntu and a WAMP server on Windows. phpMyAdmin is also configured; it helps perform various tasks such as creating, modifying, or deleting databases through a web browser. Dreamweaver is used for system development. For recommendation generation, the latest version of Apache is used, and WAMP is integrated into the Windows environment. To make the web environment scalable, it is being

integrated with PHP and WAMP. Initially, for testing purposes, a phase-one deployment is established on localhost.

In [9], various techniques are applied for data preprocessing, such as term weighting and dimensionality reduction. All these techniques were studied in order to build algorithms able to mimic human decisions regarding the comments. The results indicate the ability to mimic expert decisions on 96.78% of the data set used. The classifiers used for comparison of the results were K-Nearest Neighbors and Covalent Bond Classification. For dimensionality reduction, term-extraction techniques were also used to best characterize the categories within the data set.

As SNSs have become of paramount relevance nowadays, many people refuse to participate in or join them because of how easy it is to publish and spread content that might be considered offensive. In [4], the authors analyze reporting systems that assess content as harmless or offensive in SNSs; their approach accurately identifies inappropriate content based on accusers' reputations.

3. GAP ANALYSIS
Not all the data generated on SNSs can be considered normal. It contains a considerable amount of content that can be regarded as offensive and hateful. Manual content moderation is effective but requires a considerable amount of manpower, and it can be traumatic for humans to examine such inappropriate content. Hence, in recent times some organizations have come up with effective techniques which can be


used for filtering inappropriate content. The following table summarizes the different techniques used by different organizations to handle such content.

Reporting system        | Automatic vs. human intervention
------------------------|--------------------------------------------------
Udd                     | Hate reports are automatically filtered
Work, Blue and Hoffman  | Content withdrawn depending on owner's reputation
Facebook                | Manual review of content on social media
LinkedIn                | Automated withdrawal after being reported by a fixed number of users
Twitter                 | Manual review of content on social media, combined with automated tools

Table 1: Content Moderation Techniques

Most organizations monitor content manually. Because of this, people are exposed to offensive content, which can be hostile for the person monitoring the data and can cause mental stress. There is a need for a system that will automatically flag offensive content and reduce the manual workload. Thus, we propose a system which will automatically monitor SNSs for malign content with the help of machine learning.

4. PROPOSED SYSTEM
Automatic content moderation can be achieved with the help of traditional natural language processing techniques

coupled with supervised classification learning. Using the combination of these two methods, a model for offensive and hateful text detection is proposed. The proposed model is designed to achieve higher efficiency in offensive-text classification. Its main aim is to eliminate the need for manual content moderation. This can be achieved effectively by utilizing natural language processing and machine learning techniques that, when trained with appropriate data, predict a nearly accurate outcome. The proposed model is composed of the following core components, as shown in Figure 1.

1. Natural Language Processing: This component takes textual data as input and applies a series of natural language processing techniques so that the data can be processed by the text classifier. Here, sentences are filtered and converted into vectors of numbers.

2. Training: The Twitter corpus is given to the Natural Language Processing component, which converts it into a set of vectors. These vectors and the pre-assigned labels are used for construction and training of the classifier model. The model obtained is then improved with parameter tuning; the tuning method used here is 10-fold cross-validation.

3. Classifier model: During training, the classifier model is constructed from the vectorized sentences prepared by the Natural Language Processing component and the labels (Offensive/Normal) already present in the dataset. This trained classifier model is then used for predicting whether a given sentence is offensive or not. The classifier predicts the outcome


accurately and precisely. For this purpose, two algorithms are compared for their classification performance.

Figure 1: Proposed Architecture

Tweets contain unnecessary data such as stop words, emojis, and usernames. This kind of data does not contribute much to the classification, and hence we need to filter it out as well as normalize the text into a suitable format, so that it can be used for training the classifier to classify unknown text data. An individual tweet is taken and tokenized into words. These tokens are then inspected to detect unnecessary data such as emojis and usernames. Furthermore, unnecessary symbols and stopwords are removed in order to reduce the data volume. The main task is to normalize the data; the aim is to infer a grammar-independent representation of a given tweet. Lemmatization is used to find the lemma of each token. After this, all the filtered tokens of one tweet are collected together for further processing.
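The filtering and normalization steps above can be sketched in a few lines of Python. This is an illustrative sketch only: the tiny stopword list and the regex-based token rules are stand-ins for the full stopword corpus and lemmatizer that the pipeline assumes.

```python
import re

# A tiny illustrative stopword list; a real system would use a full corpus.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on"}

def preprocess_tweet(tweet):
    """Tokenize a tweet and drop usernames, URLs, symbols, and stopwords."""
    tweet = tweet.lower()
    tweet = re.sub(r"@\w+", " ", tweet)          # remove @usernames
    tweet = re.sub(r"https?://\S+", " ", tweet)  # remove URLs
    tokens = re.findall(r"[a-z]+", tweet)        # keep alphabetic tokens only
    return [t for t in tokens if t not in STOPWORDS]

print(preprocess_tweet("@user The game is great! http://t.co/x #win"))
# prints: ['game', 'great', 'win']
```

A lemmatization pass (not shown) would further map inflected forms such as "winning" to their lemma "win" before vectorization.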

The vectorization algorithm used in the proposed model is TF-IDF vectorization. The reason for choosing this particular technique is that the dataset used for the experimentation contains a large number of tweets with offensive words, which dominate the smaller number of regular tweets. As TF-IDF assigns a score depending on the occurrence of a term in a document, it seems to be the best choice.

The classifier model is then trained on a collection of pairs, each containing a vectorized tweet and a label indicating whether it is offensive or not. The supervised classifier used in this proposed system is thus able to learn from these tweets and to classify a new tweet. After training, when a new tweet is given to the model, it repeats all the above steps
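The paper uses Scikit-learn's TF-IDF vectorizer; the hand-rolled version below is a simplified sketch of the same scoring idea (raw term counts and an unsmoothed natural-log IDF are simplifying assumptions), showing why rarer terms receive higher weight than common ones.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return per-document TF-IDF score dicts for tokenized documents."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for doc in docs for t in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

docs = [["you", "are", "awful"], ["you", "are", "great"], ["great", "day"]]
vecs = tfidf(docs)
# "awful" appears in 1 of 3 documents, "you" in 2 of 3,
# so "awful" gets a higher score in the first document.
```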


except training the model. After going through these steps, a vectorized representation of the sentence is obtained. This representation is then given as input to the previously trained classifier model, which classifies the tweet depending on its content.

5. MATHEMATICAL MODEL
The proposed model can be represented mathematically as follows. The term frequency-inverse document frequency (TF-IDF) of words in a given corpus is calculated by

    tf-idf(t, a) = tf(t, a) x idf(t)    ...(1)

where
t  - term
a  - individual document
D  - collection of documents
tf - term frequency, i.e. the number of times a term appears in each document

and idf, the inverse document frequency, is calculated by

    idf(t) = log( |D| / |{a in D : t in a}| )

Using (1), all sentences are vectorized. Let Vi represent vectorized sentence i; then a general classifier is represented by

    y' = C(Vi)

where
y' - predicted outcome
C  - classifier function

Here, we used two classifier models (Bernoulli Naive Bayes and Bagged SVM) for performance comparison.

1.) Naive Bayes: the predicted class is

    y' = argmax_c P(c) * prod_t P(t | c)

2.) Bagged Support Vector Machines: as given in [12], Support Vector Machines can be bagged as

    H(V) = sign( sum_{m=1..M} Hm(V) )

where
Hm - sequence of base classifiers, m = 1, ..., M
M  - number of classifiers in the bagging ensemble
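The argmax rule above can be made concrete with a minimal Bernoulli Naive Bayes over binary term-presence features. This pure-Python sketch is illustrative only; the Laplace smoothing and the toy training data are assumptions, not the paper's trained model.

```python
import math
from collections import defaultdict

class TinyBernoulliNB:
    """Minimal Bernoulli Naive Bayes over sets of tokens."""
    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = sorted({t for d in docs for t in d})
        count = {c: 0 for c in self.classes}
        present = {c: defaultdict(int) for c in self.classes}
        for d, y in zip(docs, labels):
            count[y] += 1
            for t in set(d):
                present[y][t] += 1
        n = len(docs)
        self.prior = {c: math.log(count[c] / n) for c in self.classes}
        # Laplace-smoothed P(term present | class)
        self.p = {c: {t: (present[c][t] + 1) / (count[c] + 2)
                      for t in self.vocab} for c in self.classes}
        return self

    def predict(self, doc):
        tokens = set(doc)
        def loglik(c):
            s = self.prior[c]
            for t in self.vocab:
                p = self.p[c][t]
                s += math.log(p) if t in tokens else math.log(1 - p)
            return s
        return max(self.classes, key=loglik)  # the argmax rule

clf = TinyBernoulliNB().fit(
    [["you", "idiot"], ["total", "idiot"], ["nice", "day"], ["have", "nice", "trip"]],
    ["offensive", "offensive", "normal", "normal"])
print(clf.predict(["what", "an", "idiot"]))  # prints: offensive
```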


6. RESULT & DISCUSSION
We used the dataset developed by [10] and further modified it to fit the needs of


classification in the proposed system. This dataset originally contained 3 categories:
1) Normal tweets
2) Offensive tweets
3) Tweets containing hate speech

Only two categories are used for the experimentation: normal tweets and offensive tweets. Hate-speech tweets, which also contain offensive language, are filtered and treated as offensive tweets only. The proposed model is implemented with the Scikit-learn library[11] in order to obtain results. The following table shows the comparison of various predictive metrics for the two models used for training.

Results    | Bernoulli Naive Bayes' | Bagged SVM
-----------|------------------------|--------------
Accuracy   | 0.9292543021           | 0.9492245592
Precision  | 0.9439205955           | 0.9700460829
Recall     | 0.9726412682           | 0.968805932
F1-Score   | 0.9580657348           | 0.9694256108

Table 2: Performance metrics for Bernoulli Naive Bayes' and Bagged SVM
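The metrics in Table 2 follow the standard confusion-matrix definitions; the helper below makes those relationships explicit. The counts passed in are made-up toy numbers, not the paper's actual confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy counts for illustration only.
m = classification_metrics(tp=90, fp=10, fn=5, tn=45)
# accuracy = 135/150 = 0.9, precision = 90/100 = 0.9, recall = 90/95
```

A model can trade precision against recall, which is why Table 2 reports F1 (their harmonic mean) alongside accuracy.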

Figure 2: Bar chart comparing the different metrics for the two models

From Figure 2, it can be inferred that both models yield almost the same accuracy, but considering the other metrics, Bagged SVM performs better than Bernoulli Naive Bayes'.

7. FUTURE WORK
Traditionally, content moderation is done manually. This manual work can be

reduced using the proposed system. Currently, the proposed system handles textual data, but in the future it can be extended to images, videos, and audio. Further, a model with higher efficiency can be used to classify text data more effectively. Additionally, an algorithm to find out what exactly is wrong with the content can also be designed. Manual


moderators will be less exposed to hate speech and offensive content if such models are implemented on a large scale.

8. CONCLUSION
This system mainly focuses on categorizing text data into two categories, namely offensive and normal. This will help content moderators by reducing the amount of offensive data they must review. The content moderation process will be automated by the use of machine learning techniques.

REFERENCES
[1] Facebook - https://www.facebook.com/ [Access Date: 19 Dec 2018].
[2] Twitter - https://twitter.com/ [Access Date: 19 Dec 2018].
[3] LinkedIn - https://in.linkedin.com/ [Access Date: 19 Dec 2018].
[4] Félix Gómez Mármol, Manuel Gil Pérez, Gregorio Martínez Pérez. "Reporting Offensive Content in Social Networks: Toward a Reputation-Based Assessment Approach". IEEE Internet Computing, Volume 18, Issue 2, Mar.-Apr. 2014.
[5] "Facebook's 7,500 Moderators Protect You From the Internet's Most Horrifying Content. But Who's Protecting Them?" https://www.inc.com/christine-lagorio/facebook-content-moderator-lawsuit.html [Access Date: 19 Dec 2018].
[6] "Moderators who had to view child abuse content sue Microsoft, claiming PTSD." https://www.theguardian.com/technology/2017/jan/11/microsoft-employees-child-abuse-lawsuit-ptsd [Access Date: 19 Dec 2018].
[7] Stevie Chancellor, Yannis Kalantidis, Jessica A. Pater, Munmun De Choudhury, David A. Shamma. "Multimodal Classification of Moderated Online Pro-Eating Disorder Content". In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, pp. 3213-3226. ACM, May 2017.
[8] Sanafarin Mulla, Avinash Palave. "Moderation Technique For Sexually Explicit Content". In 2016 International Conference on Automatic Control and Dynamic Optimization Techniques (ICACDOT), International Institute of Information Technology (I2IT), Pune, September 2016.
[9] Marcos Rodrigues Saúde, Marcelo de Medeiros Soares, Henrique Gomes Basoni, Patrick Marques Ciarelli, Elias Oliveira. "A Strategy for Automatic Moderation of a Large Data Set of Users Comments". In 2014 XL Latin American Computing Conference (CLEI), September 2014.
[10] Thomas Davidson, Dana Warmsley, Michael Macy, Ingmar Weber. "Automated Hate Speech Detection and the Problem of Offensive Language". In Proceedings of the 11th International AAAI Conference on Web and Social Media, 2017, pp. 512-515.
[11] Scikit-learn: A module for machine learning. https://scikit-learn.org [Access Date: 19 Dec 2018].
[12] Kristína Machová, František Barčák, Peter Bednár. "A Bagging Method Using Decision Trees in the Role of Base Classifiers". Acta Polytechnica Hungarica, Vol. 3, No. 2, 2006, pp. 121-132, ISSN 1785-8860.


SURVEY PAPER ON LOCATION RECOMMENDATION USING SCALABLE CONTENT-AWARE COLLABORATIVE FILTERING AND SOCIAL NETWORKING SITES

Prof. Pramod P. Patil, Ajinkya Awati, Deepak Patil, Rohan Shingate, Akshay More
Smt. Kashibai Navale College of Engineering, Pune
[email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
Location recommendation plays an essential role in helping people find interesting places. Although recent research has studied how to recommend places using social and geographical information, only some of it has dealt with the cold-start problem of new users. Because mobility records are often shared on social networks, semantic information can be used to address this challenge. The typical method is to feed such information into explicit-feedback, content-aware collaborative filtering, but this requires drawing negative samples for better learning performance, since negative user preference is not observable in human mobility; and previous studies have empirically demonstrated that sampling-based methods do not work well. To this end, we propose a scalable Implicit-feedback based Content-aware Collaborative Filtering (ICCF) framework to incorporate semantic content and avoid negative sampling. We then develop an efficient optimization algorithm that scales linearly with data size and feature size, and quadratically with the dimension of the latent space. We also establish its relationship with graph Laplacian regularized matrix factorization. Finally, we evaluate ICCF with a large-scale LBSN data set in which users have text and content profiles. The results show that ICCF outperforms several competing baselines, and that user information is not only effective for improving recommendations but also for coping with cold-start scenarios.

Keywords - Content-aware, implicit feedback, location recommendation, social network, weighted matrix factorization.

1. INTRODUCTION
As the title of this paper suggests, it relates to recommender systems, which are part of data mining. Recommendation systems use different technologies, but they can be classified into two categories: collaborative and content-based filtering systems. Content-based systems examine the properties of articles and recommend articles similar to those that the user has preferred in the past. They model the taste of a user by building a user profile based on the properties of the elements that the user likes, and use the profile to calculate the similarity with new elements; locations that are more similar to the user's profile are recommended. Collaborative filtering systems, on the other hand, ignore the properties of the articles and base their recommendations on community preferences. They recommend the elements that users with similar tastes and preferences have liked in the past. Two users are considered similar if they have many elements in common.

One of the main problems of recommendation systems is the problem of


cold start, i.e., when a new article or user is introduced into the system. In this study we focus on the problem of producing effective recommendations for new articles: the item cold-start problem. Collaborative filtering systems suffer from this problem because they depend on previous user ratings. Content-based approaches, on the other hand, can still produce recommendations using article descriptions, and are the default solution for item cold start. However, they tend to achieve lower accuracy and, in practice, are rarely the only option. The item cold-start problem is of great practical importance for two main reasons. First, modern online platforms add hundreds of new articles every day, and actively recommending them is essential to keep users continuously engaged. Second, collaborative filtering methods are at the core of most recommendation engines, since they tend to achieve state-of-the-art accuracy; but to produce recommendations with the expected accuracy, they require that items be rated by a sufficient number of users. Therefore it is essential for any collaborative recommender to reach this state as soon as possible. Methods that produce precise recommendations for new articles allow enough feedback to be collected in a short period of time, making effective collaborative recommendations possible.

In this paper, we focus on providing location recommendations via a novel scalable Implicit-feedback based Content-aware Collaborative Filtering (ICCF) framework. It avoids sampling negative locations by considering all unvisited locations as negative and proposing a low-weight

configuration, with a rank-one structure, for the preference confidence model. This sparse and rank-one weighting configuration not only assigns appropriate confidence to visited and unvisited locations, but also subsumes three different weighting schemes previously developed for locations.

A. Motivation

 From the introductory study of recommendation systems, their applications, the algorithms used, and the different types of models, we decided to work on recommendation applications as used in e-commerce, online shopping, location recommendation, and product recommendation. A lot of work has been done on such applications, typically using recommendation systems built on traditional data mining algorithms.

 State-of-the-art approaches that generate recommendations from only positive evaluations are often based on content-aware collaborative filtering algorithms. However, they suffer from low accuracy.

2. RELATED WORK
Shuhui Jiang, Xueming Qian, Tao Mei, and Yun Fu describe "Personalized Travel Sequence Recommendation on Multi-Source Big Social Media". A personalized travel sequence recommendation system is proposed by learning a topical package model from big multi-source social media: travelogues and community-contributed photos. The advantages of the work are: 1) the system automatically mines users' and routes' travel topical preferences, including the


topical interest, cost, time and season; 2) not only POIs but also travel sequences are recommended, considering both popularity and the user's travel preferences at the same time. Famous routes are mined and ranked based on the similarity between the user package and the route package [1].

Shuyao Qi, Dingming Wu, and Nikos Mamoulis describe "Location Aware Keyword Query Suggestion Based on Document Proximity". They propose an LKS framework providing keyword suggestions that are relevant to the user's information needs and at the same time can retrieve relevant documents near the user's location [2].

X. Liu, Y. Liu, and X. Li describe "Exploring the context of locations for personalized location recommendations". They decouple the process of jointly learning latent representations of users and locations into two separate components: learning location latent representations using the Skip-gram model, and learning user latent representations using a C-WARP loss [3].

H. Li, R. Hong, D. Lian, Z. Wu, M. Wang, and Y. Ge describe "A relaxed ranking-based factor model for recommender system from implicit feedback". They propose a relaxed ranking-based algorithm for item recommendation with implicit feedback, and design a smooth and scalable optimization method for the model's parameter estimation [4].

D. Lian, Y. Ge, N. J. Yuan, X. Xie, and H. Xiong describe "Sparse Bayesian collaborative filtering for implicit feedback". They propose a sparse Bayesian collaborative filtering

algorithm tailored to implicit feedback, and develop a scalable optimization algorithm for jointly learning latent factors and hyperparameters [5].

X. He, H. Zhang, M.-Y. Kan, and T.-S. Chua describe "Fast matrix factorization for online recommendation with implicit feedback". They study the problem of learning MF models from implicit feedback. In contrast to previous work that applied a uniform weight to missing data, they propose to weight missing data based on the popularity of items. To address the key efficiency challenge in optimization, they develop a new learning algorithm which effectively learns parameters by performing coordinate descent with memoization [6].

F. Yuan, G. Guo, J. M. Jose, L. Chen, H. Yu, and W. Zhang describe "Lambdafm: learning optimal ranking with factorization machines using lambda surrogates". They present a novel ranking predictor, Lambda Factorization Machines. Inheriting advantages from both LtR and FM, LambdaFM (i) is capable of optimizing various top-N item ranking metrics in implicit feedback settings; (ii) is very flexible in incorporating context information for context-aware recommendations [7].

Yiding Liu, TuanAnh Nguyen Pham, Gao Cong, and Quan Yuan describe "An Experimental Evaluation of Point-of-Interest Recommendation in Location-based Social Networks" (2017). They provide an all-around evaluation of 12 state-of-the-art POI recommendation models. From the evaluation, they obtain several important findings, based on which one can better understand and utilize POI


recommendation models in various scenarios [8].

Salman Salamatian, Amy Zhang, Flavio du Pin Calmon, Sandilya Bhamidipati, Nadia Fawaz, Branislav Kveton, Pedro Oliveira, and Nina Taft describe "Managing your Private and Public Data: Bringing down Inference Attacks against your Privacy". In this work, a framework for content-aware collaborative filtering from implicit feedback datasets is proposed, with coordinate descent developed for efficient and effective parameter learning [9].

Zhiwen Yu, Huang Xu, Zhe Yang, and Bin Guo describe "Personalized Travel Package With Multi-Point-of-Interest Recommendation Based on Crowdsourced User Footprints". They propose an approach for personalized travel package recommendation to help users make travel plans. The approach utilizes data collected from LBSNs to model users and locations, and it determines users' preferred destinations using collaborative filtering approaches. Recommendations are generated by jointly considering user preference and spatiotemporal constraints. A heuristic search-based travel route planning algorithm is designed to generate travel packages [10].

3. EXISTING SYSTEM
A lot of work has been done in this field because of its extensive usage and applications. In this section, some of the approaches which have been implemented for the same purpose are mentioned. These works are differentiated mainly by the recommendation algorithm used.

In other research, general location route planning cannot meet users' personal requirements well. Personalized

recommendation recommends locations and routes by mining the user's travel records. The most famous method is location-based matrix factorization, in which similar social users are measured based on the location co-occurrence of previously visited locations. Recently, static topic models have been employed to model travel preferences by extracting travel topics from past traveling behaviours, which can contribute to similar-user identification. However, the travel preferences are not obtained accurately, because all travel histories of a user are treated as one document drawn from a set of static topics, which ignores the evolution of topics and travel preferences.

From our study of these papers, the open issues relate to recommendation systems. The challenge of addressing the cold-start problem from implicit feedback lies in detecting recommendations between users and locations with similar preferences.

4. PROPOSED SYSTEM
Based on this study, we propose a content-aware collaborative filtering and baseline algorithm: first find nearby locations, i.e., places and hotels, and then recommend them to the user based on implicit feedback, achieving high accuracy while also removing the cold-start problem in the recommendation system. This system particularly addresses recommendation of places for new users. A general solution is to integrate collaborative filtering with content-based filtering; from this research point of view, some popular content-based collaborative filtering frameworks have recently been proposed, but they are designed on the basis of explicit feedback, with preference samples both positively and


negatively. Only the preferred samples are implicitly provided, as positive feedback data, while it is not practical to treat all unvisited locations as negative. Feeding the mobility data, together with user and location information, into these explicit-feedback frameworks requires pseudo-negative samples drawn from places not visited. Such sampling, and the lack of different levels of confidence, cannot deliver comparable top-k recommendation.

5. System Architecture:

Fig. System Architecture
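The weighting idea at the core of ICCF (giving every unvisited location a small confidence weight instead of sampling negatives) can be sketched with a tiny weighted matrix factorization trained by stochastic gradient descent. All dimensions, weights, and learning rates here are illustrative assumptions; the actual ICCF additionally incorporates content features and a specialized coordinate-descent solver.

```python
import random

def weighted_mf(R, k=2, w_pos=1.0, w_neg=0.1, lr=0.05, reg=0.01, epochs=500):
    """Factorize a binary visit matrix R (users x locations) as U.V^T,
    giving every unvisited cell a small confidence weight instead of
    sampling negatives."""
    random.seed(0)
    n_u, n_i = len(R), len(R[0])
    U = [[0.01 + random.random() * 0.1 for _ in range(k)] for _ in range(n_u)]
    V = [[0.01 + random.random() * 0.1 for _ in range(k)] for _ in range(n_i)]
    for _ in range(epochs):
        for u in range(n_u):
            for i in range(n_i):
                w = w_pos if R[u][i] else w_neg      # confidence weight
                pred = sum(U[u][f] * V[i][f] for f in range(k))
                err = R[u][i] - pred
                for f in range(k):                   # weighted SGD step
                    uf, vf = U[u][f], V[i][f]
                    U[u][f] += lr * (w * err * vf - reg * uf)
                    V[i][f] += lr * (w * err * uf - reg * vf)
    return U, V

# Two users with overlapping check-ins; score user 0 on each location.
R = [[1, 1, 0],
     [0, 1, 1]]
U, V = weighted_mf(R)
score_visited = sum(U[0][f] * V[0][f] for f in range(2))
score_unvisited = sum(U[0][f] * V[2][f] for f in range(2))
```

Because unvisited cells carry only a small weight, they act as weak negatives: the model still ranks them, rather than forcing them to zero as hard negative samples would.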

6. CONCLUSION
In this paper, we propose an ICCF framework for content-aware collaborative filtering based on implicit feedback datasets, and develop coordinate descent for effective parameter learning. We establish the close relationship of ICCF with graph Laplacian regularized matrix factorization and show that user features really improve the mobility similarity between users. We then apply ICCF to location recommendation on a large-scale LBSN data set. Our experimental results indicate that ICCF is superior to five competing baselines, including two leading location-recommendation algorithms and ranking-based factorization machines. When comparing different weighting schemes for negative preference of the unvisited places, we observe that the user-oriented scheme is superior to that


oriented to the item, and that the sparse, rank-one configuration significantly improves the performance of the recommendation.

REFERENCES
[1] Shuhui Jiang, Xueming Qian, Tao Mei, Yun Fu. "Personalized Travel Sequence Recommendation on Multi-Source Big Social Media". IEEE Transactions on Big Data, Vol. X, No. X.
[2] Shuyao Qi, Dingming Wu, Nikos Mamoulis. "Location Aware Keyword Query Suggestion Based on Document Proximity". Vol. 28, No. 1, January 2016.
[3] X. Liu, Y. Liu, X. Li. "Exploring the context of locations for personalized location recommendations". In Proceedings of IJCAI'16. AAAI, 2016.
[4] H. Li, R. Hong, D. Lian, Z. Wu, M. Wang, Y. Ge. "A relaxed ranking-based factor model for recommender system from implicit feedback". In Proceedings of IJCAI'16, 2016, pp. 1683-1689.
[5] D. Lian, Y. Ge, N. J. Yuan, X. Xie, H. Xiong. "Sparse Bayesian collaborative filtering for implicit feedback". In Proceedings of IJCAI'16. AAAI, 2016.
[6] X. He, H. Zhang, M.-Y. Kan, T.-S. Chua. "Fast matrix factorization for online recommendation with implicit feedback". In Proceedings of SIGIR'16, 2016.
[7] F. Yuan, G. Guo, J. M. Jose, L. Chen, H. Yu, W. Zhang. "Lambdafm: learning optimal ranking with factorization machines using lambda surrogates". In Proceedings of the 25th ACM International Conference on Information and Knowledge Management. ACM, 2016, pp. 227-236.
[8] Yiding Liu, TuanAnh Nguyen Pham, Gao Cong, Quan Yuan. "An Experimental Evaluation of Point-of-Interest Recommendation in Location-based Social Networks". 2017.
[9] Salman Salamatian, Amy Zhang, Flavio du Pin Calmon, Sandilya Bhamidipati, Nadia Fawaz, Branislav Kveton, Pedro Oliveira, Nina Taft. "Managing your Private and Public Data: Bringing down Inference Attacks against your Privacy". 2015.
[10] Zhiwen Yu, Huang Xu, Zhe Yang, Bin Guo. "Personalized Travel Package With Multi-Point-of-Interest Recommendation Based on Crowdsourced User Footprints". 2016.


Anonymous Schedule Generation Using Genetic Algorithm
Adep Vaishnavi Anil, Berad Rituja Shivaji, Myana Vaishnavi Dnyaneshwar, Pawar Ashwini Janardhan

Computer Engineering, SCSMCOE,Nepti, Ahmednagar, India [email protected], [email protected], [email protected], [email protected]

ABSTRACT
In this proposed system, a genetic algorithm is applied to an automatic schedule generation system to generate a course timetable that best suits student and teacher needs. Preparing a schedule that satisfies many different constraints is a very difficult task for colleges and institutes, and the conventional scheduling process is a very basic one. This study develops a practical system for schedule generation that takes complicated constraints into consideration in order to avoid conflicts in the schedule, i.e., problems that arise after the allocation of time slots.

Keywords
Genetic Algorithm (GA), Constraints, Chromosomes, Genetic Operators.

1. INTRODUCTION
Preparing a timetable is a complicated and conflict-prone process. The traditional way of generating a timetable still produces error-prone output, even when it is prepared repeatedly in search of a suitable result. The aim of our application is to make the process simple, easily understandable and efficient, with lower time requirements; there is therefore a great need for this kind of application in educational institutes. Timetable generation is among the most common human requirements and is most widely needed in educational institutes such as schools and colleges, where courses, subjects and hours must be planned.

In earlier days timetable scheduling was a manual process in which one person or a group of people created the timetable by hand, which took more effort and still did not produce appropriate output.

The course scheduling problem can be specified as a constraint satisfaction problem (CSP). Constraints in the scheduling process can be categorized into two kinds, hard constraints and soft constraints. Common hard constraints include: (1) each course should be scheduled to a specified time slot; (2) each teacher or student can be allocated only one classroom at a time; (3) all students must fit into the particular allocated classroom. Some of the soft constraints include: (1) neither faculty nor students should have disconnected time slots in the timetable; (2) classrooms have limited capacity.

2. ALGORITHM
Step 1: Partition the training set Tr into m subsets through random sampling.

Step 2: Apply the decision tree algorithm to each subset S1, S2, ..., Sm.

Step 3: Apply each induced tree from Step 2 (Tree1, Tree2, ..., Treem) to the test set Te.

Step 4: Use the fitness function to evaluate the performance of all trees, and rank the trees with their related subsets according to the trees' performance.

Step 5: Perform GA operations:

Selection: select the top (1 − c)m subsets and keep them intact for the next operation;



Crossover: for the remaining cm subsets, form cm/2 pairs and perform two-point crossover;

Mutation: randomly select mu subsets for the mutation operation, and in each selected subset randomly replace one instance with an instance randomly selected from the original training data set.

Step 6: The new subsets created in Step 5 form the next generation; repeat Steps 2 to 6 until a subset and a related tree with the desired performance are identified.

1. Input data: The first step in the functioning of a GA is the generation of the initial input data; each individual is evaluated and assigned a fitness value according to a positive fitness function.

2. Selection: This operator selects chromosomes in the data for reproduction. The fitter a chromosome, the more times it is likely to be selected to reproduce.

3. Crossover: A genetic operator used to vary the coding of a chromosome from one generation to the next. The crossover process takes one or more parent solutions and derives child solutions from them.

4. Mutation: In mutation, the solution may change from the previous one; it is the process in which data can be interchanged in search of the best solution. When the given solution is not reliable or conflicts are present, the mutation and crossover techniques become very important, as they decide which result is best for the given input data.
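The operator sequence described above (keep the top (1 − c)m individuals, refill by two-point crossover, then mutate mu individuals) can be sketched as a minimal, generic GA loop. This is an illustrative Python sketch with a toy sum-of-genes fitness, not the paper's implementation; all names and parameters are assumptions:

```python
import random

def evolve(population, fitness, c=0.4, mu=2, generations=30, seed=1):
    """Generic GA loop: rank by fitness, keep the top (1 - c) fraction,
    refill by two-point crossover, then mutate mu individuals."""
    rng = random.Random(seed)
    n_genes = len(population[0])
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)   # rank (Step 4)
        keep = int((1 - c) * len(population))        # selection (Step 5)
        next_gen = [ind[:] for ind in population[:keep]]
        while len(next_gen) < len(population):       # two-point crossover
            p1, p2 = rng.sample(population[:keep], 2)
            i, j = sorted(rng.sample(range(n_genes), 2))
            next_gen.append(p1[:i] + p2[i:j] + p1[j:])
        for _ in range(mu):                          # mutation
            ind = rng.choice(next_gen)
            ind[rng.randrange(n_genes)] = rng.randrange(10)
        population = next_gen                        # next generation (Step 6)
    return max(population, key=fitness)

# Toy run: each gene is a digit 0-9 and fitness is the digit sum.
rng = random.Random(0)
pop = [[rng.randrange(10) for _ in range(6)] for _ in range(20)]
best = evolve(pop, fitness=sum)
print(len(best))  # → 6
```

The loop keeps the elite individuals intact each generation, which matches the "keep them intact" wording of the selection step.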

5. Fitness function: The fitness function is used to measure the quality of a represented solution, and it is problem dependent. In the field of genetic algorithms a design solution is represented as a string referred to as a chromosome. In each phase of testing the algorithm deletes the 'n' worst results and creates 'n' new ones from the best design solutions, and the final result is obtained from those solutions.

3. PROPOSED SYSTEM
The proposed system is based on a customer-centric strategy for designing the scheduling system. First, a data mining algorithm is designed for mining student preferences in course selection from historical data. Then, based on the selection patterns obtained from the mining algorithm, the scheduling is designed, which leads to an integrative, automatic course scheduling system. This system not only helps to increase student satisfaction with the course scheduling results; it also adopts the user's perspective and applies different techniques to automatic scheduling, considering teacher preferences and student needs in the schedule, so that the final output fulfils the expectations of every user. The algorithm exchanges courses given to the system as input, so as to find an optimal solution to the timetabling problem.

4. SYSTEM ARCHITECTURE
Input data: 1. Courses 2. Labs 3. Lectures 4. Sems 5. Students
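A fitness function of the kind described above, applied to the architecture's input data (courses, rooms, time slots), could count hard-constraint violations in a candidate timetable. The encoding and names below are illustrative assumptions, not the paper's implementation:

```python
def timetable_fitness(assign, enrolled, capacity):
    """Score a timetable by counting hard-constraint violations.

    assign:   {course: (slot, room)}
    enrolled: {course: number of students}
    capacity: {room: seats}
    Higher fitness means fewer conflicts.
    """
    conflicts = 0
    seen = set()
    for course, (slot, room) in assign.items():
        if (slot, room) in seen:               # two courses share a room/slot
            conflicts += 1
        seen.add((slot, room))
        if enrolled[course] > capacity[room]:  # room too small for the class
            conflicts += 1
    return -conflicts  # maximizing fitness minimizes conflicts

fit = timetable_fitness(
    {"math": (0, "r1"), "physics": (0, "r1"), "art": (1, "r2")},
    {"math": 30, "physics": 25, "art": 80},
    {"r1": 40, "r2": 60},
)
print(fit)  # → -2: one room/slot clash plus one capacity violation
```

Soft constraints (e.g. gaps in a student's day) could be added as smaller penalty terms so that hard violations always dominate.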


ISSN:0975-887



Fig : System Architecture

Output data: System constraints are categorized into two parts:

1. Hard constraints:
a. Each course should be scheduled to a specified time.
b. Each teacher or student can be allocated only one class at a time.
c. All students assigned to a particular class must be able to fit into that class.

2. Soft constraints:
a. Faculty and students should not have disconnected time slots in the timetable.
b. Classrooms have limited capacity.
c. Students should not have any free time between two classes on a day.

5. ACKNOWLEDGMENTS
We are thankful to Prof. Lagad J. U., Prof. Tambe R., Prof. Pawar S. R., Prof. Jadhav H. and Prof. Avhad P., Department of Computer Engineering, Shri Chhatrapati Shivaji Maharaj College of Engineering.

REFERENCES
[1] Meysam Shahvali Kohshori, Mohammad Saniee Abadeh, Hedieh Sajedi, "A fuzzy genetic algorithm with local search for university course timetabling problem", 2008 20th IEEE International Conference on Tools with Artificial Intelligence.
[2] Antariksha Bhaduri, "University time table scheduling using genetic artificial immune network", 2009 International Conference on Advances in Recent Technologies in Communication and Computing.
[3] Sadaf N. Jat, Shengxiang Yang, "A memetic algorithm for the university course timetabling problem", 2008 20th IEEE International Conference on Tools with Artificial Intelligence.
[4] Mosaic Space Blog, "The practice and theory of automated timetabling", PATAT 2010, Mosaic Space Blog, University and College Planning and Management, retrieved from http://mosaic.com/blog, 2011, last accessed 21st January 2012.
[5] Hitoshi Kanoh, Yuusuke Sakamoto, "Interactive timetabling system using knowledge-based genetic algorithm", 2004 IEEE International Conference on Systems, Man and Cybernetics.
[6] D. de Werra, "An introduction to timetabling", European Journal of Operational Research, Vol. 19, 1985, pp. 151-162.
[7] S. Even, A. Itai and A. Shamir, "On the complexity of timetabling and multicommodity flow problems", SIAM Journal on Computing, pp. 691-703, 1976.
[8] D. E. Goldberg, "Genetic Algorithms in Search, Optimization and Machine Learning", hardcover, 1989.
[9] L. Davis, "Handbook of Genetic Algorithms", Van Nostrand Reinhold, 1991.
[10] Anuja Chowdhary, "Time table generation system", Vol. 3, Issue 2, February 2014, pp. 410-414.
[11] Dilip Datta, Kalyanmoy Deb, Carlos M. Fonseca, "Solving class timetabling problem of IIT Kanpur using multi-objective evolutionary algorithm", KanGAL, 2005.
[12] Melanie Mitchell, "An Introduction to Genetic Algorithms", A Bradford Book, The MIT Press, fifth printing, 1999.
[13] M. Ayob and G. Jaradat, "Hybrid ant colony systems for course timetabling problems", in Data Mining and Optimization, DMO'09, 2nd Conference on, 2009, pp. 120-126.



A Survey on Unsupervised Feature Learning Using a Novel Non-Symmetric Deep Autoencoder (NDAE) for NIDPS Framework

Vinav Autkar1, Prof. P. R. Chandre2, Dr. Purnima Lala Mehta3
1,2

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Savitribai Phule Pune University, Pune
3 Department of ECE, HMR Institute of Technology and Management, Delhi
[email protected], [email protected], [email protected]

ABSTRACT
Redundant and irrelevant features in data have caused a long-standing problem in network traffic classification. In recent years, one of the main focuses within Network Intrusion Detection System (NIDS) research has been the application of machine learning and shallow learning techniques. This paper proposes a novel deep learning model to enable NIDS operation within modern networks. The model is a combination of deep and shallow learning, capable of accurately analysing a wide range of network traffic. The approach proposes a Non-symmetric Deep Auto-Encoder (NDAE) for unsupervised feature learning, and furthermore proposes a novel deep learning classification model constructed using stacked NDAEs. Our proposed classifier has been implemented in Graphics Processing Unit (GPU)-enabled TensorFlow and evaluated using the benchmark KDD Cup '99 and NSL-KDD network intrusion detection datasets. However, to overcome the limitations of the KDD datasets, the proposed system also uses a WSN trace dataset. A further contribution is the implementation of an Intrusion Prevention System (IPS), which contains IDS functionality but is a more sophisticated system capable of taking immediate action in order to prevent or reduce malicious behaviour.

General Terms
Non-Symmetric Deep Auto-Encoder, Restricted Boltzmann Machine, Deep Belief Network.

Keywords
Deep learning, Anomaly detection, Autoencoders, KDD, Network security

1. INTRODUCTION
One of the major challenges in network security is the provision of a robust and effective Network Intrusion Detection System (NIDS). Despite the significant advances in NIDS technology, the majority of solutions still operate using less-capable signature-based techniques rather than anomaly detection techniques. As a result, current systems give ineffective and inaccurate detection of attacks.
There are three fundamental limitations: the volume of network data; the in-depth monitoring and granularity required to improve effectiveness and accuracy; and finally the number of different protocols and the diversity of the data traversing the network. The primary focus in developing NIDSs has been the use of machine learning and shallow learning techniques. Early deep learning research has shown that its superior layer-wise feature learning can better, or at least match, the performance of shallow learning techniques. It is capable of facilitating a deeper



analysis of network data and faster identification of any anomalies. In this paper, we propose a new deep learning model for NIDPS for modern networks.

2. MOTIVATION
 A new NDAE technique for unsupervised feature learning, which unlike typical autoencoder approaches provides non-symmetric data dimensionality reduction. Hence, our technique is able to facilitate improved classification results when compared with leading methods such as Deep Belief Networks (DBNs).
 A novel classifier model that utilizes stacked NDAEs and the RF classification algorithm, combining deep and shallow learning techniques to exploit their strengths and decrease analytical overheads. We should be able to obtain better results than similar research, while significantly reducing the training time.

3. REVIEW OF LITERATURE
The paper [1] focuses on deep learning methods, which are inspired by the structural depth of the human brain and learn from lower-level features up to higher-level concepts. Because of this abstraction over multiple levels, the Deep Belief Network (DBN) helps to learn functions mapping from input to output, and the learning process does not depend on human-crafted features. A DBN uses an unsupervised learning algorithm, a Restricted Boltzmann Machine (RBM), for each layer. Advantages are: deep coding's ability to adapt to changing contexts in the data ensures that the technique conducts exhaustive data analysis, and it detects abnormalities in the system, covering anomaly detection and traffic identification. Disadvantages are: the demand for faster and more efficient data assessment.

The main purpose of paper [2] is to review and summarize the work on deep learning for machine health monitoring. The applications of deep learning in machine health monitoring systems are reviewed mainly from the following aspects: the auto-encoder (AE) and its variants; Restricted Boltzmann Machines and their variants, including the Deep Belief Network (DBN) and Deep Boltzmann Machines (DBM); Convolutional Neural Networks (CNN); and Recurrent Neural Networks (RNN). Advantages are: DL-based MHMS does not require extensive human labour and expert knowledge, and the applications of deep learning models are not restricted to specific kinds of machines. Disadvantages are: the performance of DL-based MHMS depends heavily on the scale and quality of the datasets.

Paper [3] proposes the use of a stacked denoising autoencoder (SdA), a deep learning algorithm, to establish an FDC model for simultaneous feature extraction and classification. The SdA model can identify global and invariant features in the sensor signals for fault monitoring and is robust against measurement noise. An SdA consists of denoising autoencoders stacked layer by layer. This multilayered architecture is capable of learning global features from complex input data, such as multivariate time-series datasets and high-resolution images. Advantages are: the SdA model is useful in real applications and effectively learns normal and fault-related features from sensor signals without preprocessing. Disadvantages are: a trained SdA needs to be investigated further to identify the process parameters that most significantly impact the classification results.

Paper [4] proposes a novel deep learning-based recurrent neural network (RNN) model for automatic security audit of short messages from prisons, which can classify short messages as secure or insecure. In this paper, the feature



of short messages is extracted by word2vec, which captures word-order information, and each sentence is mapped to a feature vector. In particular, words with similar meaning are mapped to similar positions in the vector space, and the vectors are then classified by RNNs. Advantages are: the RNN model achieves an average 92.7% accuracy, which is higher than SVM, and ensemble frameworks integrating different feature extraction and classification algorithms can be used to boost overall performance. Disadvantages are: it applies only to short messages, not large-scale messages.

In [5], a signature-based features technique using a deep convolutional neural network in a cloud platform is proposed for plate localization, character detection and segmentation. Extracting significant features enables the LPRS to adequately recognize the license plate in challenging situations such as (i) congested traffic with multiple plates in the image, (ii) plate orientation towards brightness, (iii) extra information on the plate, (iv) distortion due to wear and tear, and (v) distortion of captured images in bad weather, such as hazy images. Advantages are: the proposed algorithm is superior in license-plate recognition accuracy to other traditional LPRS. Disadvantages are: there are some unrecognized or mis-detected images.

In paper [6], a deep learning approach for anomaly detection using a Restricted Boltzmann Machine (RBM) and a deep belief network is implemented. This method uses a one-hidden-layer RBM to perform unsupervised feature reduction. The resultant weights from this RBM are passed to another RBM, producing a deep belief network. The pre-trained weights are passed into a fine-tuning layer consisting of a Logistic Regression (LR) classifier with multi-class soft-max. Advantages are: it achieves 97.9% accuracy and produces a low false-negative rate of 2.47%. Disadvantages are: the method needs improvement to maximize the feature reduction process in the deep learning network, and the dataset needs improvement.

The paper [7] proposes a deep learning-based approach for developing an efficient and flexible NIDS. A sparse autoencoder and soft-max regression-based NIDS was implemented, using Self-Taught Learning (STL), a deep learning technique, on NSL-KDD, a benchmark dataset for network intrusion. Advantages are: STL achieved a classification accuracy rate of more than 98% for all types of classification. Disadvantages are: a real-time NIDS for actual networks using this deep learning technique still needs to be implemented.

Paper [8] chooses multi-core CPUs as well as GPUs to evaluate the performance of a DNN-based IDS on large network data. The parallel computing capabilities of the neural network enable the Deep Neural Network (DNN) to look through the network traffic effectively with accelerated performance. Advantages are: the DNN-based IDS is reliable and efficient in identifying the specific attack classes, given the required number of training samples, and the multi-core CPUs were faster than the serial training mechanism. Disadvantages are: the detection accuracy of the DNN-based IDS needs improvement.

Paper [9] proposes a mechanism for detecting large-scale network-wide attacks using Replicator Neural Networks (RNNs) to create anomaly detection models. The approach is unsupervised and requires no labelled data, and it accurately detects network-wide anomalies without presuming that the training data is completely free of attacks. Advantages are: the proposed methodology successfully discovers all injected prominent DDoS attacks and SYN port scans, and it is resilient against



learning in the presence of attacks, something that related work lacks. Disadvantages are: the methodology could be further improved by using stacked autoencoder deep learning techniques.

In paper [10], based on the flow-based nature of SDN, a flow-based anomaly detection system using deep learning is proposed, applying a deep learning approach in an SDN environment. Advantages are: it finds optimal hyper-parameters for the DNN and confirms the detection rate and false alarm rate; the model achieves an accuracy of 75.75%, which is quite reasonable from using just six basic network features. Disadvantages are: it will not work in a real SDN environment.

4. OPEN ISSUES
Present network traffic data, which are often huge in size, present a significant challenge to IDSs. These "Big Data" slow down the entire detection process and may lead to unsatisfactory classification accuracy because of the computational difficulties in handling such data; classifying a huge amount of data usually causes many mathematical difficulties, which in turn lead to higher computational complexity. Machine learning technologies have commonly been used in IDSs. However, most traditional machine learning technologies are forms of shallow learning, which cannot effectively solve the massive intrusion-data classification problem that arises in a real network application environment. Moreover, shallow learning is incompatible with intelligent analysis and the predetermined requirements of high-dimensional learning with huge data. This is a serious disadvantage, since computer systems and the internet have become a major part of the critical infrastructure.

5. SYSTEM OVERVIEW
This paper [11] proposes a novel deep learning model to enable NIDS operation within modern networks. The proposed model is a combination of deep and shallow learning, capable of correctly analysing a wide range of network traffic. More specifically, we combine the power of stacking our proposed Non-symmetric Deep Auto-Encoder (NDAE) (deep learning) and the accuracy and speed of Random Forest (RF) (shallow learning). This paper introduces our NDAE, which is an auto-encoder featuring non-symmetrical multiple hidden layers. The NDAE can be used as a hierarchical unsupervised feature extractor that scales well to accommodate high-dimensional inputs. It learns non-trivial features using a training strategy similar to that of a typical auto-encoder. Stacking the NDAEs offers a layer-wise unsupervised representation learning algorithm, which allows our model to learn the complex relationships between different features. It also has feature extraction capabilities, so it is able to refine the model by prioritizing the most descriptive features.

The existing system in the paper used the NSL-KDD dataset, which is a refined version of the KDD '99 dataset. The NSL-KDD dataset is used for IDS and has 41 features, which makes it more accurate. However, it has the limitation that it cannot be used for wireless networks. To overcome this limitation of the NSL-KDD dataset, the proposed system uses a WSN dataset. The WSN dataset has 12 attributes, which are given in Table I.



Table I. WSN Trace Dataset Attributes (12 in total)

Event          protocol_used
Time           port_number
from_node      transmission_rate_kbps
to_node        received_rate_kbps
hopcount       drop_rate_kbps
packet_size    Class

Fig. 1 Proposed System Architecture

Advantages are: due to the deep learning technique, the accuracy of the intrusion detection system improves.

Fig. 1 shows the proposed system architecture of the Network Intrusion Detection and Prevention System (NIDPS). The input traffic data comes from the WSN dataset with 12 features. The training dataset undergoes data preprocessing, which includes two steps: data transformation and data normalization. The system then uses two NDAEs arranged in a stack to select the features, and afterwards applies the Random Forest classifier for attack detection. The Intrusion Prevention System (IPS) contains IDS functionality but is a more sophisticated system capable of taking immediate action in order to prevent or reduce malicious behaviour.
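To make the stacked-encoder stage of the architecture concrete, the following is a minimal pure-Python sketch of an encoder-only (no decoder) stack of layers, as the NDAE design prescribes. The weights here are random placeholders for illustration (they would normally be learned by minimizing reconstruction error), the Random Forest stage is omitted, and none of the names come from the authors' implementation:

```python
import math
import random

def sigmoid(v):
    return [1 / (1 + math.exp(-x)) for x in v]

def layer(x, W, b):
    """One encoder layer: h = sigmoid(W.x + b)."""
    return sigmoid([sum(w * xi for w, xi in zip(row, x)) + bi
                    for row, bi in zip(W, b)])

def stacked_encode(x, layers):
    """Pass the input through each encoder layer in turn (no decoder)."""
    for W, b in layers:
        x = layer(x, W, b)
    return x

# Toy weights for a 4 -> 3 -> 2 non-symmetric stacked encoder.
rng = random.Random(0)
mk = lambda rows, cols: [[rng.uniform(-1, 1) for _ in range(cols)]
                         for _ in range(rows)]
layers = [(mk(3, 4), [0.0] * 3), (mk(2, 3), [0.0] * 2)]

features = stacked_encode([0.2, 0.9, 0.1, 0.4], layers)
print(len(features))  # → 2: the reduced representation fed to the classifier
```

The reduced feature vector produced by the stack is what a shallow classifier such as Random Forest would consume in the architecture of Fig. 1.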


 The network or computer is constantly monitored for any invasion or attack.
 The system can be modified and changed according to the needs of a specific client and can help against outside as well as inner threats to the system and network.
 It effectively prevents any damage to the network.
 It provides a user-friendly interface which allows easy security management.
 Any alterations to files and directories on the system can be easily detected and reported.

6. ALGORITHM
A Deep Belief Network (DBN)[11] is a complex type of generative neural network that uses an unsupervised machine learning model to produce results. This kind of network illustrates some of the recent work on using largely unlabelled data to build unsupervised models. Some researchers describe a Deep Belief Network as a set of Restricted Boltzmann Machines (RBMs) stacked on top of each other. In general, deep belief



networks are composed of multiple smaller unsupervised neural networks. One of the common features of a DBN is that although the layers have connections between them, the network does not include connections between units within a single layer. It uses stacked Restricted Boltzmann Machines, each of which has two layers, called the hidden layer and the visible layer.

A rule status monitoring algorithm is used to recognize and detect the attack. We define a rule set as a file consisting of a set (or category) of rules that share a common set of characteristics. Our goal is to develop an algorithm that monitors the collection of rule sets so as to identify the state of each rule in each rule set, in terms of whether it is enabled or disabled, and to build useful statistics based on these findings. The algorithm should also provide periodic updates of this information; this may be accomplished by running it as a daemon with an appropriately selected period.

7. Mathematical Model
7.1. Preprocessing: In this step, the training data source (T) is normalized to be ready for processing using the following steps:

x̄_ij = (x_ij − μ_j) / σ_j    (1)

where T is m samples with n column attributes, x_ij is the jth column attribute in the ith sample, and μ and σ are 1×n matrices holding the training-data mean and standard deviation, respectively, for each of the n attributes. The test dataset (TS), which is used to measure detection accuracy, is normalized using the same μ and σ, as follows:

x̄s_ij = (xs_ij − μ_j) / σ_j    (2)

7.2. Feature Selection: The NDAE is an auto-encoder featuring non-symmetrical multiple hidden layers. The proposed NDAE takes an input vector x ∈ R^d (here d represents the dimension of the vector) and step-by-step maps it to the latent representations h_i using the deterministic function shown in (3) below:

h_i = σ(W_i · h_{i−1} + b_i),  i = 1, …, n;  h_0 = x    (3)

Here, σ is an activation function (this work uses the sigmoid function) and n is the number of hidden layers. Unlike a conventional auto-encoder or deep auto-encoder, the proposed NDAE does not contain a decoder, and its output vector y is calculated from the latent representation h_n by a formula similar to (3):

y = σ(W_{n+1} · h_n + b_{n+1})    (4)

The estimator θ of the model can be obtained by minimizing the square reconstruction error over the m training samples x_1, …, x_m, as shown in (5):

θ = argmin_θ Σ_{i=1..m} ‖x_i − y_i‖²    (5)
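The preprocessing step amounts to z-score normalization with the mean and standard deviation fitted on the training data only and reused for the test data. A small illustrative sketch (function names are hypothetical, not from the paper):

```python
import statistics

def fit_normalizer(T):
    """Compute per-attribute mean and standard deviation from training
    samples T (a list of equal-length rows)."""
    cols = list(zip(*T))
    mu = [statistics.mean(c) for c in cols]
    sigma = [statistics.pstdev(c) or 1.0 for c in cols]  # guard sigma = 0
    return mu, sigma

def normalize(rows, mu, sigma):
    """Apply the normalization: x_ij -> (x_ij - mu_j) / sigma_j."""
    return [[(x - m) / s for x, m, s in zip(row, mu, sigma)]
            for row in rows]

train = [[1.0, 10.0], [3.0, 30.0]]
mu, sigma = fit_normalizer(train)            # mu = [2, 20], sigma = [1, 10]
print(normalize(train, mu, sigma))           # → [[-1.0, -1.0], [1.0, 1.0]]
print(normalize([[2.0, 40.0]], mu, sigma))   # test rows reuse train mu/sigma
```

Reusing the training-set statistics for the test set, as the text specifies, avoids leaking test-set information into the preprocessing stage.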

8. CONCLUSION AND FUTURE WORK
In this paper, we have discussed the problems faced by existing NIDS techniques. In response, we have proposed our novel NDAE method for unsupervised feature learning, and we have built upon this by proposing a novel classification model constructed from stacked NDAEs and the RF classification algorithm. We have also implemented the intrusion prevention system. The results show that our approach offers high levels



of accuracy, precision and recall, together with reduced training time. The proposed NIDS improves accuracy by only about 5%, so further improvement in accuracy is needed, along with further work on real-time network traffic and on handling zero-day attacks.

REFERENCES
[1] B. Dong and X. Wang, "Comparison deep learning method to traditional methods using for network intrusion detection," in Proc. 8th IEEE Int. Conf. Commun. Softw. Netw., Beijing, China, Jun. 2016, pp. 581–585.
[2] R. Zhao, R. Yan, Z. Chen, K. Mao, P. Wang, and R. X. Gao, "Deep learning and its applications to machine health monitoring: A survey," submitted to IEEE Trans. Neural Netw. Learn. Syst., 2016. [Online]. Available: http://arxiv.org/abs/1612.07640
[3] H. Lee, Y. Kim, and C. O. Kim, "A deep learning model for robust wafer fault monitoring with sensor measurement noise," IEEE Trans. Semicond. Manuf., vol. 30, no. 1, pp. 23–31, Feb. 2017.
[4] L. You, Y. Li, Y. Wang, J. Zhang, and Y. Yang, "A deep learning based RNNs model for automatic security audit of short messages," in Proc. 16th Int. Symp. Commun. Inf. Technol., Qingdao, China, Sep. 2016, pp. 225–229.
[5] R. Polishetty, M. Roopaei, and P. Rad, "A next-generation secure cloud based deep learning license plate recognition for smart cities," in Proc. 15th IEEE Int. Conf. Mach.


Learn. Appl., Anaheim, CA, USA, Dec. 2016, pp. 286–293.
[6] K. Alrawashdeh and C. Purdy, "Toward an online anomaly intrusion detection system based on deep learning," in Proc. 15th IEEE Int. Conf. Mach. Learn. Appl., Anaheim, CA, USA, Dec. 2016, pp. 195–200.
[7] A. Javaid, Q. Niyaz, W. Sun, and M. Alam, "A deep learning approach for network intrusion detection system," in Proc. 9th EAI Int. Conf. Bio-Inspired Inf. Commun. Technol., 2016, pp. 21–26. [Online]. Available: http://dx.doi.org/10.4108/eai.3-12-2015.2262516
[8] S. Potluri and C. Diedrich, "Accelerated deep neural networks for enhanced intrusion detection system," in Proc. IEEE 21st Int. Conf. Emerg. Technol. Factory Autom., Berlin, Germany, Sep. 2016, pp. 1–8.
[9] C. Garcia Cordero, S. Hauke, M. Muhlhauser, and M. Fischer, "Analyzing flow-based anomaly intrusion detection using replicator neural networks," in Proc. 14th Annu. Conf. Privacy, Security and Trust, Auckland, New Zealand, Dec. 2016, pp. 317–324.
[10] T. A. Tang, L. Mhamdi, D. McLernon, S. A. R. Zaidi, and M. Ghogho, "Deep learning approach for network intrusion detection in software defined networking," in Proc. Int. Conf. Wireless Netw. Mobile Commun., Oct. 2016, pp. 258–26
[11] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, "A deep learning approach to network intrusion detection," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 2, no. 1, February 2018.



TURING MACHINE IMITATE ARTIFICIAL INTELLIGENCE

Tulashiram B. Pisal1, Prof. Dr. Arjun P. Ghatule2

1

Research Scholar, Sinhgad Institute of Computer Sciences, Pandharpur (MS), India
2 Director, Board of Examinations and Evaluation, University of Mumbai, India
[email protected], [email protected]

ABSTRACT
A Turing Machine is the mathematical tool corresponding to a digital computer. It is a widely used model of computation in computability and complexity theory. According to Turing's hypothesis, a function is computable by an algorithm if and only if it is computable by a Turing machine. Problems that cannot be solved by a Turing machine cannot be solved by any modern computer program. It accepts all types of languages. A Turing machine manipulates symbols on a tape according to transition rules. Due to its simplicity, a Turing machine can be amended to simulate the logic of any computer algorithm and is particularly useful in explaining the functions of a central processing unit (CPU) inside a computer. A Turing machine is able to imitate Artificial Intelligence.

General Terms
Turing Machine implements machine learning.

Keywords
Turing Machine, Artificial Intelligence, Finite Automata, Push Down Automata, Transition Diagram, Turing Test

1. INTRODUCTION
The Turing machine was introduced in 1936 by Alan Turing as a model of computation, and the Turing test was developed by Alan Turing in 1950[1]. He proposed that the Turing test be used to determine whether or not a computer or machine can think intelligently like a human. The game is played by three players, of which two are human and one is a computer. One human is the interrogator, whose job is to find out which one is human and which one is the computer by asking questions of both of them; however, distinguishing a computer from a human is a hard task, and the interrogator's guess may be wrong. If the interrogator cannot distinguish the answers given by the human and the computer, the computer passes the test, which indicates that the computer is as intelligent as a human[2-3]. For both computer and human, the whole conversation takes place only through a computer keyboard and screen. The abstract machine could not be designed without consideration of the Turing test, and the Turing test is represented logically using symbols for better understanding. Before the study of cognitive science, we could not regard machine thinking as being like human thinking[4-5]. The Turing test is shown in the following figure, Fig.1.

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 138

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

Fig.1: Turing Test

The Turing test empirically determines whether a computer has achieved intelligence. The Turing test combines both human behaviour and intelligent behaviour [6-8]. It uses natural language processing to communicate with the computer, and it plays a crucial role in artificial intelligence for game playing. In games like chess and tennis, the computer can beat a world-class player. Game playing involves numerous possible moves for each single move of the opponent in order to reach the goal state with an optimal solution. A computer is able to play the imitation game in such a way that the interrogator is not given a chance of making the right identification of which player is a machine and which is human [9]. Artificial intelligence has covered all gaming grounds over the world since Turing wrote his paper in 1950. The imitation game and computer game bots play a significant role in game playing [10-11], and they also play important roles in all other games.

2. POWER OF MACHINES
All operations handled by any real machine can also be handled by a Turing machine with intelligence [12]. A real machine has only a limited, finite number of configurations; an actual real machine is in effect a linear bounded automaton. Because its tape is infinite on both sides, a Turing machine has an unconstrained amount of storage space for its computations. The Finite Automaton (FA), the Push Down Automaton (PDA) and the Post Machine have no control over the input and cannot modify their own input symbols. The PDA has two types: the deterministic Push Down Automaton (DPDA) and the non-deterministic Push Down Automaton (NPDA); the NPDA is more powerful than the DPDA. The Turing Machine (TM) is more powerful due to its deterministic nature [17-18]. The comparative nature of the various machines is shown in the following table 1.

Table 1. Deterministic Nature of Machines
Sr.No  Machine Name       Data Structure               Nature
1      Finite Automata    No                           Deterministic
2      Pushdown Automata  Stack                        Non-Deterministic
3      Turing Machine     Infinite tape to both sides  Deterministic

Turing machines simplify the statement of algorithms to run in memory, while a real machine has a problem enlarging its memory space. The power of the various machines is shown in the following equation 1.

(1)
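To make the "Data Structure" column of Table 1 concrete, here is a small sketch (our own illustration, not part of the paper) of a PDA-style recognizer for L = {an bn | n>=1} using an explicit stack, the extra memory that separates a PDA from a finite automaton:

```python
def pda_accepts_anbn(s):
    """PDA-style recognizer for L = {a^n b^n | n >= 1}:
    push a marker for each 'a', pop one for each 'b'."""
    stack = []
    seen_b = False
    for ch in s:
        if ch == 'a':
            if seen_b:
                return False   # an 'a' after a 'b' breaks the a^n b^n shape
            stack.append('A')
        elif ch == 'b':
            seen_b = True
            if not stack:
                return False   # more b's than a's
            stack.pop()
        else:
            return False       # symbol outside the input alphabet
    return seen_b and not stack  # every 'a' matched and n >= 1

print(pda_accepts_anbn('aaabbb'))  # True
print(pda_accepts_anbn('aabbb'))   # False
```

A finite automaton cannot do this for unbounded n, since it would need a distinct state for every possible count of unmatched a's.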


3. TURING MACHINE
A Turing machine is a mathematical model of a machine or computer that describes an intellectual machine for any problem. The machine handles finite symbols on a tape according to rules [13]. A Turing machine computes the algorithms constructed for it, thanks to the model's simplicity. The machine has a tape of infinite length in both directions, divided into small squares known as cells. Each cell contains only one symbol from a finite alphabet, and empty cells are filled with the blank symbol. A head is used to read and write symbols on the tape, and it is initially positioned at the leftmost symbol. The machine can move one cell at a time to the left or right, or make no movement. The finite states are stored in the state register of the Turing machine, which is initialized with a special start state. A finite table of rules is used to read the current input symbol from the tape and modify it, moving the tape head left, right, or not at all [14-15]. The Turing machine mechanically operates on a tape as shown in the following figure Fig.2. A Turing machine consists of:
1) Input Tape: A tape that is infinite in both directions and divided into cells. Each cell contains only one symbol from the finite alphabet. The alphabet contains a special symbol known as blank, written as 'B'. The tape is assumed to be arbitrarily extendable to both the left and the right for computation.
2) Read/Write Head: A head that can read and write only one symbol at a time on the tape and move to the left, to the right, or not at all.
3) Finite State Control: A state control that stores the state of the Turing machine from the initial state to a halting state. If the Turing machine reaches a final state after reading the last symbol, the input string is accepted; otherwise the input string is rejected.


[Figure: an input tape of cells (a, a, b, b, B, …), a read/write head, and the finite state control]
Fig.2: Turing Machine Model

3.1 Mathematical Representation
All operations handled by a real machine can also be handled by a Turing machine with intelligence. A real machine has only a limited, finite number of configurations; an actual real machine is in effect a linear bounded automaton [16]. Because its tape is infinite on both sides, a Turing machine has an unconstrained amount of storage space for its computations. A Turing machine is represented by 7 tuples [23], i.e. M = (Q, ∑, Γ, δ, q0, B, F) where:
Q is the finite set of states
∑ is the finite set of input alphabets
Γ is the finite set of tape alphabets
δ is the transition function; δ: Q × Γ → Q × Γ × {L, R, N} where,
L: move to the left
R: move to the right
N: no movement
q0 is the initial state
B is the blank symbol
F is the set of final states or halting states.
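The 7-tuple can be written down directly as data. The following sketch is our own illustration (the helper name and the concrete sets are ours; they match the a^n b^n example machine used later in this paper, with state q2 and alphabet {a, b} included for consistency); it also checks that a transition table is well-typed against δ: Q × Γ → Q × Γ × {L, R, N}:

```python
# The 7-tuple M = (Q, Sigma, Gamma, delta, q0, B, F) encoded as plain data.
Q     = {'q0', 'q1', 'q2', 'q3', 'q4'}
SIGMA = {'a', 'b'}                  # input alphabet
GAMMA = {'a', 'b', 'x', 'y', 'B'}   # tape alphabet, including the blank 'B'
Q0, BLANK, F = 'q0', 'B', {'q4'}
MOVES = {'L', 'R', 'N'}

# Two sample transitions; a full machine lists one entry per reachable (state, symbol).
delta = {('q0', 'a'): ('q1', 'x', 'R'), ('q0', 'y'): ('q4', 'y', 'N')}

def is_well_typed(delta):
    """Check that delta is a partial map Q x Gamma -> Q x Gamma x {L, R, N}."""
    return all(q in Q and s in GAMMA and q2 in Q and w in GAMMA and m in MOVES
               for (q, s), (q2, w, m) in delta.items())

print(is_well_typed(delta))  # True
```

Such a check catches a common slip in hand-written machines: a rule that writes a symbol outside Γ or moves in a direction other than L, R or N.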


3.2 The Language Acceptance
The formal language tool is used to apply user-specific constraints for pattern mining. In a formal language, we need to recognize the category of a grammar, but recognizing the category of a grammar is a difficult task. The Turing Machine accepts all types of grammars; hence, there is no need to recognize the category of grammar for a constraint [19]. The use of the Turing Machine for sequential pattern mining is therefore a flexible specification tool. Figure Fig.3 shows the acceptance of all types of languages by the Turing Machine.

Fig.3: Language Acceptance by Turing Machine

3.3 Transition Diagram
The transition diagram is used to represent Turing machine computations. The transition rules can also be represented using a state transition diagram, in which a circle represents a state and arrows represent transitions between states. Each state transition depends upon the current state and the current tape symbol, and it yields a new state, a changed tape symbol and a head movement. The Java Formal Languages and Automata Package (JFLAP) is used to design a Turing machine for L = {an bn | n>=1} [21]. The following figure Fig.4 shows the transition diagram for L = {an bn | n>=1}.

Fig.4: Transition Diagram for L = {an bn | n>=1}

3.4 Transition Rules
The definition of the Turing machine is represented using the tuple format. The machine is represented in the mathematical model as follows:
M = ({q0, q1, q2, q3, q4}, {a, b}, {a, b, x, y, B}, δ, q0, B, {q4})
where,
δ(q0, a) = (q1, x, R)
δ(q0, y) = (q4, y, N)
δ(q1, a) = (q1, a, R)
δ(q1, b) = (q1, b, R)
δ(q1, y) = (q2, y, L)
δ(q1, B) = (q2, B, L)
δ(q2, b) = (q3, y, L)
δ(q3, a) = (q3, a, L)
δ(q3, b) = (q3, b, L)
δ(q3, x) = (q0, x, R)
q4 is the halting state.

3.5 Instantaneous Description
The step-by-step string processing in the Turing machine is known as an instantaneous description (ID). The Turing machine accepts recursively enumerable languages, which is extensible and implemented in
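The transition rules for L = {an bn | n>=1} can be exercised mechanically. Below is a minimal sketch of a Turing machine simulator (the function name and rule encoding are our own illustration, not the paper's JFLAP design); the rule table includes state q2, which the rules pass through:

```python
# Minimal Turing machine simulator (illustrative sketch).
# delta maps (state, symbol) -> (new_state, written_symbol, move), move in {L, R, N}.

def run_tm(delta, accept_states, tape_input, start='q0', blank='B', max_steps=10_000):
    tape = dict(enumerate(tape_input))  # sparse tape: infinite in both directions
    state, head = start, 0
    for _ in range(max_steps):
        if state in accept_states:
            return True
        key = (state, tape.get(head, blank))
        if key not in delta:
            return False                # no applicable rule: the string is rejected
        state, written, move = delta[key]
        tape[head] = written
        head += {'L': -1, 'R': 1, 'N': 0}[move]
    return False

# Transition rules for L = {a^n b^n | n >= 1} (with state q2 included).
delta = {
    ('q0', 'a'): ('q1', 'x', 'R'), ('q0', 'y'): ('q4', 'y', 'N'),
    ('q1', 'a'): ('q1', 'a', 'R'), ('q1', 'b'): ('q1', 'b', 'R'),
    ('q1', 'y'): ('q2', 'y', 'L'), ('q1', 'B'): ('q2', 'B', 'L'),
    ('q2', 'b'): ('q3', 'y', 'L'),
    ('q3', 'a'): ('q3', 'a', 'L'), ('q3', 'b'): ('q3', 'b', 'L'),
    ('q3', 'x'): ('q0', 'x', 'R'),
}

print(run_tm(delta, {'q4'}, 'aabb'))  # True
print(run_tm(delta, {'q4'}, 'aab'))   # False
```

Since this machine is deterministic, rejection simply means that no rule applies (or the step budget runs out), which is enough for a quick check of the table.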


JFLAP [22]. An instantaneous description of a Turing machine includes:
1) The input string at any point of time
2) The position of the head
3) The state of the machine
The string a1a2…ai-1 q ai ai+1…an gives the snapshot of the machine in which:
1) q is the state of the Turing machine.
2) The head is scanning the symbol ai.
The instantaneous description of scanning symbol ai with the machine in state q is shown in the following figure Fig.5.

Fig.5: Instantaneous Description

The instantaneous descriptions for the string a4b4 are shown in the following figure Fig.6.

Fig.6: Instantaneous Descriptions for String a4b4

3.6 Grammar Representation
The Turing machine accepts all types of grammars. The unrestricted grammar for L = {an bn | n>=1} is shown in the following table 2.

Table 2. Unrestricted Grammar for L = {an bn | n>=1}
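The step-by-step IDs can also be generated programmatically. The following sketch is our own illustration (the rule table repeats just the a-n-b-n transitions that the short input ab exercises); it prints each snapshot in the a1…ai-1 q ai…an form:

```python
# Print the instantaneous descriptions (IDs) of a short Turing machine run.
DELTA = {
    ('q0', 'a'): ('q1', 'x', 'R'), ('q0', 'y'): ('q4', 'y', 'N'),
    ('q1', 'b'): ('q1', 'b', 'R'), ('q1', 'B'): ('q2', 'B', 'L'),
    ('q2', 'b'): ('q3', 'y', 'L'), ('q3', 'x'): ('q0', 'x', 'R'),
}

def id_trace(tape_input, start='q0', blank='B', halting='q4', limit=100):
    """Return the list of IDs visited, stopping on the halting state or a missing rule."""
    tape, state, head = list(tape_input), start, 0
    ids = []
    for _ in range(limit):
        ids.append(''.join(tape[:head]) + ' ' + state + ' '
                   + (''.join(tape[head:]) or blank))
        if state == halting:
            break
        symbol = tape[head] if head < len(tape) else blank
        if (state, symbol) not in DELTA:
            break
        state, written, move = DELTA[(state, symbol)]
        if head == len(tape):
            tape.append(written)
        else:
            tape[head] = written   # sketch assumes the head stays at index >= 0
        head += {'L': -1, 'R': 1, 'N': 0}[move]
    return ids

for snapshot in id_trace('ab'):
    print(snapshot)
```

For the input ab the trace runs from " q0 ab" through "x q2 bB" and ends in the halting ID "x q4 yB", mirroring the hand-drawn IDs of Fig.5 and Fig.6.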

3.7 String Acceptance
String acceptance or rejection is shown with the help of the JFLAP tool. The string acceptance for the input string aaaabbbbB is illustrated by JFLAP. The following figure Fig.7 shows the acceptance of the string.


Fig.7: Acceptance for String aaaabbbbB

The string rejection for the input string aaaabbbB is illustrated by JFLAP in the following figure Fig.8.

Fig.8: Rejection for String aaaabbbB

4. CONCLUSION
The Turing Machine is the most comprehensive, deep, and accessible model of computation in existence, and its allied theories allow many ideas involving time and cost complexity to be gainfully discussed. In providing a sort of atomic structure for the concept of computation, it has led to new mathematical investigations. The development of the last 20 years has been the categorization of diverse problems in terms of their complexity, which gives a platform-independent approach to determining complexity. Nowadays a computer can be used to simulate the process of a Turing machine on the screen. The Turing machine has numerous applications, for example as an enumerator or a function computer, and it is a core part of Artificial Intelligence.

5. ACKNOWLEDGMENTS
This research is a part of my research work entitled "Sequential Pattern Mining using Turing Machine". We thank Dr. Arjun P. Ghatule for his help and for the discussions on the topics in this paper. I also thank Dr. Kailas J. Karande, Principal, SKN Sinhgad College of Engineering, Pandharpur, for his help and discussions on the topics of this paper. The paper is partially supported by Sinhgad Institute of Computer Sciences, Pandharpur, of Solapur University, Solapur (MS), India.

REFERENCES
[1] Christof Teuscher, "Alan Turing: Life and Legacy of a Great Thinker", Springer, ISBN 978-3-662-05642-4.
[2] Guy Avraham, Ilana Nisky, Hugo L. Fernandes, Daniel E. Acuna, Konrad P. Kording, Gerald E. Loeb, and Amir Karniel, "Towards Perceiving Robots as Humans: Three Handshake Models Face the Turing-like Handshake Test", IEEE, 2012.
[3] Stuart Shieber, "The Turing Test: Verbal Behavior as the Hallmark of Intelligence", MIT Press, Cambridge, ISBN 0-262-69293-7, pp. 407-412.
[4] Shane T. Mueller, "Is the Turing Test Still Relevant? A Plan for Developing the Cognitive Decathlon to Test Intelligent Embodied Behavior", 19th Midwest Artificial Intelligence and Cognitive Science Conference, 2008, pp. 1-8.
[5] Roman V. Yampolskiy, "Turing Test as a Defining Feature of AI-Completeness", Springer-Verlag Berlin Heidelberg, 2012, pp. 3-17.
[6] Saul Traiger, "Making the Right Identification in the Turing Test", Minds and Machines, Kluwer Academic Publishers, Netherlands, 2000, pp. 561-572.
[7] Ulrich J. Pfeiffer, Bert Timmermans, Gary Bente, Kai Vogeley and Leonhard Schilbach, "A Non-Verbal Turing Test: Differentiating Mind from Machine in Gaze-Based Social Interaction", PLoS One, Volume 6, Issue 11, 2011, pp. 1-12.
[8] Jan van Leeuwen and Jiri Wiedermann, "Question Answering and Cognitive Automata with Background Intelligence", pp. 1-15.
[9] John F. Stins and Steven Laureys, "Thought translation, tennis and Turing tests in the vegetative state", Springer, 2009, pp. 1-10.
[10] B. Kirkpatrick and B. Klingner, "Turing's Imitation Game: a discussion with the benefit of hind-sight", Berkeley Computer Science course "Reading the Classics", 2004, pp. 1-5.
[11] Philip Hingston, "A Turing Test for Computer Game Bots", IEEE Transactions on Computational Intelligence and AI in Games, Volume 1, No. 3, 2009, pp. 169-186.
[12] Ayse Pinar Saygin, Ilyas Cicekli and Varol Akman, "Turing Test: 50 Years Later", Minds and Machines, 2000, pp. 463-518.
[13] John E. Hopcroft, Rajeev Motwani and Jeffrey D. Ullman, "Automata Theory, Languages, and Computation", Delhi: Pearson, 2008.
[14] Vivek Kulkarni, "Theory of Computation", Pune: Tech-Max, 2007.
[15] Dilip Kumar Sultania, "Theory of Computation", Pune: Tech-Max, 2010.
[16] K. L. P. Mishra and N. Chandrasekaran, "Theory of Computer Science: Automata, Languages and Computation", Prentice Hall of India Private Limited, New Delhi, 2007.
[17] Tirtharaj Dash and Tanistha Nayak, "Comparative Analysis on Turing Machine and Quantum Turing Machine", Journal of Global Research in Computer Science, ISSN 2229-371X, Volume 3, No. 5, 2012, pp. 51-56.
[18] Amandeep Kaur, "Enigmatic Power of Turing Machines: A Review", International Journal of Computer Science & Engineering Technology (IJCSET), ISSN 2229-3345, Volume 6, 2015, pp. 427-430.
[19] Gerhard Jager and James Rogers, "Formal language theory: refining the Chomsky hierarchy", Philos Trans R Soc Lond B Biol Sci., 2012, pp. 1956-1970.
[20] Nazir Ahmad Zafar and Fawaz Alsaade, "Syntax-Tree Regular Expression Based DFA Formal Construction", Intelligent Information Management, 2012, pp. 138-146.
[21] JFLAP Tool for Simulating Results and Validation. [Online]. Available: http://www.jflap.org.
[22] Ankur Singh and Jainendra Singh, "Implementation of Recursively Enumerable Languages using Universal Turing Machine in JFLAP", International Journal of Information and Computation Technology, ISSN 0974-2239, Volume 4, Number 1, 2014, pp. 79-84.
[23] Tulashiram B. Pisal and Arjun P. Ghatule, "Implicit Conversion of Deterministic Finite Automata to Turing Machine", International Journal of Innovations & Advancement in Computer Science (IJIACS), ISSN 2347-8616, Volume 7, Issue 3, March 2018, pp. 606-616.


A SURVEY ON EMOTION RECOGNITION BETWEEN POMS AND GAUSSIAN NAÏVE BAYES ALGORITHM USING TWITTER API

Darshan Vallur1, Prathamesh Kulkarni2, Suraj Kenjale3, Shubham Shinde4

1,2,3,4 Smt Kashibai Navale College of Engineering, Pune, India.
[email protected], [email protected], [email protected], [email protected]

ABSTRACT
The analysis of social networks is a very tough research area, and a fundamental element of it concerns the detection of user communities. Existing work on emotion recognition on Twitter relies mainly on the use of lexicons and simple classifiers over bag-of-words models. The vital question of our study is whether we can increase their overall performance using machine learning algorithms. The novel Profile of Mood States (POMS) algorithm represents a twelve-dimensional mood state representation using 65 adjectives combined with Ekman's and Plutchik's emotion categories: joy, anger, depression, fatigue, vigour, tension, confusion, disgust, fear, trust, surprise and anticipation. These emotions are recognized with the help of text-based bag-of-words and LSI algorithms. The contribution of this work is a machine learning algorithm for emotion classification that consumes less time without human labeling intervention. The Gaussian Naïve Bayes classifier works on a testing dataset with the help of a large training dataset. We measure the performance of the POMS and Gaussian Naïve Bayes algorithms on the Twitter API. The experimental outcome is shown with the help of Emojis for emotion recognition using tweet contents.

Keywords- Emotion Recognition, Text Mining, Twitter, LSI, Recurrent Neural Networks, Convolutional Neural Networks, Gaussian Naïve Bayes Classifier

1. INTRODUCTION

Emotions can be defined as conscious affect attitudes, which constitute the display of a feeling. In recent years, a large number of studies have focused on emotion detection using opinion mining on social media. Due to some intrinsic characteristics of the texts produced on social media sites, such as their limited length and casual expression, emotion recognition on them is a challenging task.

Previous studies mainly focus on lexicon-based and machine learning based methods. The performance of lexicon-based methods relies heavily on the quality of the emotion lexicon, and the performance of machine learning methods relies heavily on the features. Therefore, we work with the three classifications that are the most popular and have also been used before by researchers from computational linguistics and natural language processing


(NLP). Paul Ekman defined six basic emotions by studying facial expressions. Robert Plutchik extended Ekman's categorization with two additional emotions and presented his categorization in a wheel of emotions. Finally, the Profile of Mood States (POMS) is a psychological instrument that defines a six-dimensional mood state representation using text mining. The novel POMS algorithm generates a twelve-dimensional mood state representation using 65 adjectives combined with Ekman's and Plutchik's emotion categories: anger, depression, fatigue, vigour, tension, confusion, joy, disgust, fear, trust, surprise and anticipation. Previous work generally studied only one emotion classification. Working with multiple classifications simultaneously not only enables performance comparisons between different emotion categorizations on the same type of data, but also allows us to develop a single model for predicting multiple classifications at the same time.

Motivation
The system developed based on our proposed approach would be able to automatically detect what people feel about their lives from Twitter messages. For example, the system can recognize:
 the percentage of people expressing higher levels of life satisfaction in one group versus another group,
 the percentage of people who feel happy and cheerful,
 the percentage of people who feel calm and peaceful, and
 the percentage of people expressing higher levels of anxiety or depression.

2. RELATED WORK

The paper [1] investigates whether public mood, as measured from a large-scale collection of tweets posted on twitter.com, is correlated with or even predictive of DJIA values. The results show that changes in the public mood state can indeed be tracked from the content of large-scale Twitter feeds by means of rather simple text processing techniques, and that such changes respond to a variety of socio-cultural drivers in a highly differentiated manner. Advantages are: it increases performance, and public mood analysis from Twitter feeds offers an automatic, fast, free and large-scale addition to this toolkit that may be optimized to measure a variety of dimensions of the public mood state. Disadvantages are: it is subject to geographical and cultural sampling errors.
The paper [2] explored an application of deep recurrent neural networks to the task of sentence-level opinion expression extraction. DSEs (direct subjective expressions) consist of explicit mentions of private states or speech events expressing private states, and ESEs (expressive subjective expressions) consist of expressions that indicate sentiment, emotion, etc., without explicitly conveying them. Advantages are: deep RNNs outperformed previous (semi-)CRF baselines, achieving new state-of-the-art results for fine-grained opinion expression extraction. Disadvantages are: the RNNs do not have access to any features other than word vectors.
The paper [3] analyzes electoral tweets for more subtly expressed information such as the sentiment (positive or negative), the emotion (joy, sadness, anger, etc.), the purpose or intent behind the tweet (to point out a mistake, to support, to ridicule, etc.), and the style of the tweet (simple statement, sarcasm, hyperbole, etc.). There are two sections: on annotating text for sentiment, emotion, style, and categories such as purpose, and on automatic classifiers for detecting these categories. Advantages are: using a multitude of custom engineered features, such as those concerning emoticons, punctuation, elongated words and negation, along with unigrams, bigrams and emotion lexicon features, the SVM classifier achieved a higher accuracy, and tweets are automatically classified into eleven categories of emotions. Disadvantages are: it does not


summarize tweets, and it does not automatically identify other semantic roles of emotions such as degree, reason, and empathy target.
The article [4] shows that emotion-word hashtags are good manual labels of emotions in tweets, and proposes a method to generate a large lexicon of word-emotion associations from this emotion-labeled tweet corpus. This is the first lexicon with real-valued word-emotion association scores. Advantages are: using hashtagged tweets, one can collect large amounts of labeled data for any emotion that is used as a hashtag by tweeters; the hashtag emotion lexicon performed significantly better than the manually created WordNet affect lexicon; and personality can be detected automatically from text. Disadvantages are: this paper works only on the given text, not on synonyms of that text.
The paper [5] develops a multi-task DNN for learning representations across multiple tasks, not only leveraging large amounts of cross-task data, but also benefiting from a regularization effect that leads to more general representations to help tasks in new domains. It is a multi-task deep neural network for representation learning, focusing in particular on semantic classification (query classification) and semantic information retrieval (ranking for web search) tasks, and it demonstrates strong results on query classification and web search. Advantages are: the MT-DNN robustly outperforms strong baselines across all web search and query classification tasks, and the multi-task DNN model successfully combines tasks as disparate as classification and ranking. Disadvantages are: query classification was incorporated either as a classification or a ranking task, without a comprehensive exploration of the alternatives.
The paper [6] i) demonstrates how large amounts of social media data can be used for large-scale open-vocabulary personality detection; ii) analyzes which features are predictive of which personality dimension; and iii) presents a novel corpus of 1.2M English tweets (1,500 authors) annotated for gender and MBTI. Advantages are: the personality distinctions, namely INTROVERT-EXTROVERT (I-E) and THINKING-FEELING (T-F), can be predicted from social media data with high reliability, and large-scale, open-vocabulary analysis of user attributes can help improve classification accuracy.
The paper [7] focuses on studying two fundamental NLP tasks, Discourse Parsing and Sentiment Analysis, through the development of three independent recursive neural nets: two for the key subtasks of discourse parsing, namely structure prediction and relation prediction, and a third net for sentiment prediction. Advantages are: the latent discourse features can help boost the performance of a neural sentiment analyzer, and pre-training and the individual models are an order of magnitude faster than the multi-tasking model. Disadvantages are: predictions on multi-sentential text are difficult.

3. EXISTING SYSTEM

The ability of the human face to communicate emotional states via facial expressions is well known, and past research has established the importance and universality of emotional facial expressions. However, recent evidence has revealed that facial expressions of emotion are most accurately recognized when the perceiver and expresser are from the same cultural in-group. Paul Ekman used facial expressions to define a set of six universally recognizable basic emotions: anger, disgust, fear, joy, sadness and surprise. Robert Plutchik defined a wheel-like diagram with a set of eight basic, pairwise contrasting emotions: joy - sadness, trust - disgust, fear - anger and surprise - anticipation. We consider each of these emotions as a separate category, and disregard the different levels of intensity that Plutchik defines in his wheel of emotions. Disadvantages:


A. Ekman's facial expression limitations:
1. Image quality
Image quality affects how well facial-recognition algorithms work. The image quality of scanning video is quite low compared with that of a digital camera.
2. Image size
When a face-detection algorithm finds a face in an image or in a still from a video capture, the relative size of that face compared with the enrolled image size affects how well the face will be recognized.
3. Face angle
The relative angle of the target's face influences the recognition score profoundly. When a face is enrolled in the recognition software, usually multiple angles are used (profile, frontal and 45-degree are common).
4. Processing and storage
Even though high-definition video is quite low in resolution when compared with digital camera images, it still occupies significant amounts of disk space. Processing every frame of video is an enormous undertaking, so usually only a fraction (10 percent to 25 percent) is actually run through a recognition system.
B. Plutchik's algorithm limitations:
1. The FPGA kit uses hardware that is expensive, making this approach a cost-ineffective technological solution.
2. There is an additional dimension which involves a lot of tedious calculations.

4. SYSTEM OVERVIEW

Profile of Mood States is a psychological instrument for assessing an individual's mood state. It defines 65 adjectives that are rated by the subject on a five-point scale. Each adjective contributes to one of six categories. For example, feeling annoyed will positively contribute to the anger category. The higher the score for an adjective, the more it contributes to the overall score for its category, except for relaxed and efficient, whose contributions to their respective categories are negative. POMS combines these ratings into a six-dimensional mood state representation consisting of the categories: anger, depression, fatigue, vigour, tension and confusion. Compared to the original structure, we discarded the adjective blue, since it only rarely corresponds to an emotion rather than a color, and word-sense disambiguation tools were unsuccessful at distinguishing between the two meanings. We also removed the adjectives relaxed and efficient, which have negative contributions, since tweets containing them would represent counter-examples for their corresponding categories.
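The POMS scoring just described can be sketched in a few lines. The adjective-to-category map below is a small illustrative subset of our own choosing, not the full 65-adjective instrument:

```python
# Illustrative subset of POMS: each adjective maps to one mood category.
POMS_CATEGORIES = {
    'annoyed': 'anger', 'furious': 'anger',
    'sad': 'depression', 'hopeless': 'depression',
    'exhausted': 'fatigue', 'weary': 'fatigue',
    'lively': 'vigour', 'energetic': 'vigour',
    'tense': 'tension', 'nervous': 'tension',
    'confused': 'confusion', 'muddled': 'confusion',
}

def poms_scores(ratings):
    """Sum each adjective's 1-5 rating into its category score."""
    scores = {}
    for adjective, rating in ratings.items():
        category = POMS_CATEGORIES.get(adjective)
        if category is not None:
            scores[category] = scores.get(category, 0) + rating
    return scores

print(poms_scores({'annoyed': 4, 'furious': 2, 'sad': 3}))
# {'anger': 6, 'depression': 3}
```

Negative-contribution adjectives such as relaxed and efficient are simply absent here, matching the removal described above.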

Fig. 1 System Architecture

The contribution of this paper is to implement the novel Profile of Mood States (POMS) algorithm, generating a twelve-dimensional mood state representation using 65 adjectives combined with Ekman's and Plutchik's emotion categories: joy, anger, depression, fatigue, vigour, tension, confusion, disgust, fear, trust, surprise and anticipation. The machine learning algorithm consumes less time without human labeling intervention. The Gaussian Naïve Bayes classifier works on a testing dataset with the help of a large training dataset, and it gives the same results as the POMS tagging methods. A further contribution of this work is the prediction of Emojis for emotion recognition using tweet contents.

5. MATHEMATICAL MODEL

5.1 Set Theory
Let us consider S as a system for emotion recognition.
S = {I, F, O}
I = {i1, i2, i3, …} is the set of inputs to the function set: comments or tweets submitted by the user.
O = {o1, o2, o3, …} is the set of outputs from the function sets: the detected emotions of the users and, finally, the displayed tweets.
F = {f1, f2, f3, …, fn} is the set of functions to execute commands: tweet extraction, training-set generation, tweet processing, keyword extraction, tweet classification, emotional tweet detection, and broadcasting the tweet review.

5.2 Latent Dirichlet Allocation (LDA) Algorithm
First and foremost, LDA provides a generative model that describes how the documents in a dataset were created. In this context, a dataset is a collection of D documents, and a document is a collection of words, so the generative model describes how each document obtains its words. Initially, let us assume we know K topic distributions for our dataset, meaning K multinomials containing V elements each, where V is the number of terms in our corpus. Let βi represent the multinomial for the ith topic, where the size of βi is V: |βi| = V. Given these distributions, the LDA generative process is as follows:
Steps:
1. For each document:
(a) Randomly choose a distribution over topics (a multinomial of length K)
(b) For each word in the document:
(i) Probabilistically draw one of the K topics from the distribution over topics obtained in (a), say topic βj
(ii) Probabilistically draw one of the V words from βj

6. CONCLUSION

This project implements a novel algorithm Profile of Mood States (POMS) represents twelve-dimensional mood state representation using 65 adjectives with combination of Ekman‘s and Plutchik‘s emotions categories like, joy, anger, depression, fatigue, vigour, tension, confusion, disgust, fear, trust, surprise and anticipation. These POMS classifies the emotions with the help of bag-of-words and LSI algorithm. The machine learning Gaussian Naïve Bayes classifier is used to classify emotions, which gives results as accurate and less time consumption compares to POMS. REFERENCES [1] J. Bollen, H. Mao, and X.-J. Zeng, ―Twitter mood predicts the stock market,‖ J. of Computational Science, vol. 2, no. 1, pp. 1–8, 2011. [2] O. Irsoy and C. Cardie, ―Opinion Mining with Deep Recurrent Neural Networks,‖ in Proc. of the Conf. on Empirical Methods in Natural Language Processing. ACL, 2014, pp. 720– 728. [3] S. M. Mohammad, X. Zhu, S. Kiritchenko, and J. Martin, ―Sentiment, emotion, purpose, and style in electoral tweets,‖ Information Processing and Management, vol. 51, no. 4, pp. 480–499, 2015. [4] S. M. Mohammad and S. Kiritchenko, ―Using Hashtags to Capture Fine Emotion Categories from Tweets,‖ Computational Intelligence, vol. 31, no. 2, pp. 301–326, 2015. [5] X. Liu, J. Gao, X. He, L. Deng, K. Duh, and Y.-Y. Wang, ―Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval,‖ Proc. of the 2015 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 912–921, 2015. [6] B. Plank and D. Hovy, ―Personality Traits on

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 149

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

Twitter —or— How to Get 1,500 Personality Tests in a Week," in Proc. of the 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, 2015, pp. 92–98.

[7] B. Nejat, G. Carenini, and R. Ng, "Exploring Joint Neural Model for Sentence Level Discourse Parsing and Sentiment Analysis," Proc. of the SIGDIAL 2017 Conf., pp. 289–298, 2017.

ANTI DEPRESSION CHATBOT IN JAVA

Manas Mamidwar1, Ameya Marathe2, Ishan Mehendale3, Abdullah Pothiyawala4, Prof. A. A. Deshmukh5 1,2,3,4,5

Department of Computer Engineering, SKNCOE, Pune 411041, Savitribai Phule Pune University, Pune [email protected], [email protected], [email protected], [email protected], [email protected]

1. INTRODUCTION
The steps taken by students in their early learning years shape their future. There is a lot of pressure on them from their parents and peers to perform well. This can lead to extreme levels of depression, which can take a toll on their health. So, we decided to design a web app to help students cope with the stress, and to make a better app than those previously available. This chatbot helps students aged 14 to 22 cope with the pressure of studies. The bot determines the stress or depression level using a simple questionnaire at the start and advances to better assess the situation in later stages.

General Terms
Depression, Depression level, Stanford CoreNLP

Keywords
Chatbot

2. MOTIVATION
The steps taken by students in their early learning years shape their future. There is a lot of pressure on them from their parents and peers to perform well. This can lead to extreme levels of depression, which can take a toll on their health. So, we decided to design a web app to help students cope with the stress, and to make a better app than those previously available.

3. PROBLEM STATEMENT
Create a chatbot that helps students aged 14 to 22 cope with the pressure of studies. The bot determines the stress or depression level using a simple questionnaire at the start and advances to better assess the situation in later stages. It should also help sportspeople balance their play and studies.

4. STATE OF ART

Table 1: State of art

1. The chatbot Feels You – A Counseling Service Using Emotional Response Generation: This paper uses a DNN for context and emotion recognition to generate an appropriate response by recognizing the best-suited reaction.
2. Speech Analysis and Depression: Formant and jitter frequencies in speech are calculated, based upon which a depression level is determined.
3. Affective and Content Analysis of Online Depression Communities: Linguistic Inquiry and Word Count (LIWC) is used for depression recognition. A survey of various clinical and control communities is conducted for a better understanding of depression patterns.
4. Detection of Depression in Speech: Survey-based paper; volunteers are required to speak on certain questions, stories and visual images, and feature selection is used to facilitate depression recognition.
5. A Model For Prediction Of Human Depression Using Apriori Algorithm: About 500 records are taken as test data for the model. The model is tested on 500 individuals and successfully predicts the percentage of individuals suffering from depression. The following factors of depression are considered: lifestyle, life events, non-psychiatric illness, acquired infection, medical treatments, professional activities, stress, relationship status, etc. The questions were based on family problem (FA), financial problem (FP), unemployment (UE), remuneration (REM), addiction (ADD), workplace (ORG), relationship (RL), congenital diseases (CD), apprehension (AP), hallucination (HL), and sleeping problem (SLP).
6. Clinical Depression analysis Using Speech Features: The speech of the depressed person is recorded by one of the person's family members or friends. Using the linear features of the speech, the model calculates the person's depression level.
7. Internet Improves Health Outcomes in Depression: The paper suggests some websites where solutions to users' problems can be found. It is a kind of self-help. The model uses the theory of behavior change.
8. Detecting Depression Using Multimodal Approach of Emotion Recognition: There are various ways to take input, viz. speech input, textual input, etc. Eight emotions are considered and, accordingly, an alert is sent to the doctor.
9. Classification of depression state based on articulatory precision: Given that neurophysiological changes due to major depressive disorder influence the articulatory precision of speech production, vocal tract formant frequencies and their velocity and acceleration were investigated toward automatic classification of depression state.
10. Predicting anxiety and depression in elderly patients using machine learning technology: The model uses ten machine learning algorithms, such as Naive Bayes, Random Forest, Bayesian Network, K-star, etc., to classify whether patients have depression or not. Of these ten algorithms, the best one is chosen using the confusion matrix.

5. GAP ANALYSIS

Table 2: Gap analysis

1. The chatbot Feels You – A Counseling Service Using Emotional Response Generation.
Excerpt: This paper uses a DNN for context and emotion recognition to generate an appropriate response by recognizing the best-suited reaction.
Differentiating points: Our project focuses on a specific context, depression, and gives a specific solution.

2. Speech Analysis and Depression.
Excerpt: Formant and jitter frequencies in speech are calculated, based upon which a depression level is determined.
Differentiating points: The app mentioned in the paper is Android-exclusive, whereas we are planning to create a web application. We are also going to provide a solution along with the depression-level calculation, which the Android app does not provide.

3. Affective and Content Analysis of Online Depression Communities.
Excerpt: Linguistic Inquiry and Word Count (LIWC) is used for depression recognition. A survey of various clinical and control communities is conducted for a better understanding of depression patterns.
Differentiating points: The paper just provides a way of detecting depression. Our app detects and quantifies depression and gives a satisfactory solution for the same.

4. Detection of Depression in Speech.
Excerpt: Survey-based paper; volunteers are required to speak on certain questions, stories and visual images, and feature selection is used to facilitate depression recognition.
Differentiating points: Limited questions, no solution provided, and unable to recognize the root cause of depression, while our app does all of the above.

5. A Model For Prediction Of Human Depression Using Apriori Algorithm.
Excerpt: About 500 records are taken as test data for the model. The model is tested on 500 individuals and successfully predicts the percentage of individuals suffering from depression. The following factors of depression are considered: lifestyle, life events, non-psychiatric illness, acquired infection, medical treatments, professional activities, stress, relationship status, etc. The questions were based on family problem (FA), financial problem (FP), unemployment (UE), remuneration (REM), addiction (ADD), workplace (ORG), relationship (RL), congenital diseases (CD), apprehension (AP), hallucination (HL), and sleeping problem (SLP).
Differentiating points: Only able to detect the depression level; no solutions are provided. The Apriori algorithm also has its own disadvantages.

6. Clinical Depression analysis Using Speech Features.
Excerpt: The speech of the depressed person is recorded by one of the person's family members or friends. Using the linear features of the speech, the model calculates the person's depression level.
Differentiating points: No solution is provided; only the depression level is determined. To use the model, the depressed person has to depend on another person. In our app, the person himself interacts with the system.

7. Internet Improves Health Outcomes in Depression.
Excerpt: The paper suggests some websites where solutions to users' problems can be found. It is a kind of self-help. The model uses the theory of behavior change.
Differentiating points: The websites provide only a generalized solution, not a specific solution to the problem. We give a specific solution to the problem.

8. Detecting Depression Using Multimodal Approach of Emotion Recognition.
Excerpt: There are various ways to take input, viz. speech input, textual input, etc. Eight emotions are considered and, accordingly, an alert is sent to the doctor.
Differentiating points: The model is not useful once someone has gone into depression; it only suggests preventive measures, while our app suggests preventive measures as well as solutions when the person has gone into depression. If the person has depression, an immediate alert is sent to the doctor, but if the user is not comfortable talking with the doctor, then his/her depression will not get treated. In our app, we provide the solution, and if the person is in severe depression, we encourage the user to seek help from the doctor.

9. Classification of depression state based on articulatory precision.
Excerpt: Given that neurophysiological changes due to major depressive disorder influence the articulatory precision of speech production, vocal tract formant frequencies and their velocity and acceleration were investigated toward automatic classification of depression state.

10. Predicting anxiety and depression in elderly patients using machine learning technology.
Excerpt: The model uses ten machine learning algorithms, such as Naive Bayes, Random Forest, Bayesian Network, K-star, etc., to classify whether patients have depression or not. Of these ten algorithms, the best one is chosen using the confusion matrix.
Differentiating points: A lot of time is spent on determining the best algorithm for prediction, and no solution is provided. Our application is fast and also provides the solution.

6. PROPOSED WORK

Fig 5.1 Proposed Architecture

1. First, if not already registered in the system, the user has to sign up. The signup stage is foolproof and is secured with an OTP verification step.
2. After signup, the user is taken to the login page. On the first attempt after login, he/she is given a text area to describe his/her mental state, upon which a specialized questionnaire with respect to his/her depression level is provided.
3. There are basically three levels of depression, going from 1 to 3 in order of increasing severity.
4. The first two levels are considered curable with our app itself. Here, an option for a chatbot, available 24/7, is provided. Two types of students can use the app: sports and regular. The chatbot is provided for regular students. A messenger is created for sports students, where they will be provided with a token and can contact the admin, who has experience in dealing with sports and study stress.
5. In the case of a very severe condition, the contact details of a renowned psychiatrist will be provided. The app generates reminders after specific intervals to check the student's progress after some remedies have been adopted.

Fig 5.2 Activity Flow Diagram

Formulae
1. Each node, with vector a ∈ R^d, is assigned a label via
y^a = softmax(W_s a)   ...(IV)
where W_s ∈ R^{5×d} is the sentiment classification matrix.
2. Two child vectors b, c ∈ R^d are composed through the tensor product
h = [b; c]^T V^{[1:d]} [b; c]   ...(I)
h_i = [b; c]^T V^{[i]} [b; c]   ...(II)
p = f(W [b; c] + [b; c]^T V^{[1:d]} [b; c])   ...(III)
where h ∈ R^d is the output of the tensor product, V^{[1:d]} ∈ R^{2d×2d×d} is the tensor that defines multiple bilinear forms, and V^{[i]} ∈ R^{2d×2d} is each slice of V^{[1:d]}.
3. The error function of a sentence is
E(θ) = Σ_i Σ_j t_j^i log y_j^i + λ ‖θ‖^2   ...(V)
where θ = (V, W, W_s, L).

5.1.3 Working
First, we provide a text area in which the user has to express his/her condition. A function is executed on this text area which splits all the sentences present in it and returns the number of sentences and an array of sentences. Stanford CoreNLP is applied to this array of sentences to compute the sentiment level of each sentence. If any one sentence's sentiment level returns 1 (negative), then the sentiment level of the complete text area is 1.


If the number of sentences with sentiment level 2 (neutral) is greater than or equal to the number of sentences with sentiment level 3 (positive), then the sentiment level of the complete text area is 2; else the sentiment level of the complete text area is 3. Depending on the sentiment level of the text area, a question set (10 questions) is provided (except for sentiment level 3).

Condition 1: Sentiment level = 1
1. Out of the 10 questions, 4 questions strictly focus on whether the user is going to harm himself/herself or not.
2. If the answer to any of these 4 questions is yes, then the depression level is determined as 3.
3. If, out of the remaining 6 questions, the user answers at least 4 questions as yes, then the depression level is determined as 2.
4. Otherwise, the depression level is 1.

Condition 2: Sentiment level = 2
1. Out of the 10 questions, 4 questions strictly focus on whether the user is going to harm himself/herself or not.
2. If the answer to any of these 4 questions is yes, then the depression level is determined as 3.
3. If, out of the remaining 6 questions, the user answers at least 5 questions as yes, then the depression level is determined as 2.
4. Otherwise, the depression level is 1.
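The aggregation and decision rules above can be sketched as follows. This is a minimal sketch: the per-sentence sentiment levels are assumed to come from an external analyzer (Stanford CoreNLP in the proposed system), and the function names are illustrative, not taken from the paper.

```python
def overall_sentiment(sentence_levels):
    """Aggregate per-sentence sentiment levels (1 = negative, 2 = neutral,
    3 = positive) into a single level for the whole text area."""
    if 1 in sentence_levels:
        return 1  # any negative sentence makes the whole text negative
    neutral = sum(1 for s in sentence_levels if s == 2)
    positive = sum(1 for s in sentence_levels if s == 3)
    return 2 if neutral >= positive else 3


def depression_level(sentiment_level, harm_answers, remaining_answers):
    """Map questionnaire answers to a depression level (1-3).

    harm_answers: yes/no answers (True/False) to the 4 self-harm questions.
    remaining_answers: yes/no answers to the remaining 6 questions.
    """
    if sentiment_level == 3:
        return None  # no questionnaire; only a basic solution is provided
    if any(harm_answers):
        return 3
    # The "yes" threshold differs by sentiment level: 4 for level 1, 5 for level 2.
    threshold = 4 if sentiment_level == 1 else 5
    if sum(remaining_answers) >= threshold:
        return 2
    return 1
```

For example, `overall_sentiment([2, 3, 2])` yields 2, which then routes the user to the Condition 2 question set.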


Condition 3: Sentiment level = 3
Only a basic solution will be provided.

7. CONCLUSION AND FUTURE WORK
We are emotional beings looking for context, relevance and connection in a technology-ridden world, and nothing is better than that very technology enhancing human interactions and easing our tasks. That is the reason why, on deep diving into the status quo of the AI-driven market in particular, we find a vested interest in the development of conversational UIs. It comes as no wonder, for such is the penetration of chat as a medium of conversation today.

Chatbots learn to do new things by trawling through a huge swath of information. They are designed to spot patterns and repeat actions associated with them when triggered by keywords, phrases or other stimuli. They seem clever, but they are not; they are adaptive and predictive in their learning curve. This means that if the input is poor, or repeats questionable statements, the chatbot's behavior will evolve accordingly.

Anti-depression chatbots would help depressed people communicate more efficiently with psychiatrists and find a solution to their problems. A hospital can have its own anti-depression chatbot so that more patients get covered. If the chatbot can identify various languages, it will be more efficient. These chatbots would really help teenagers who are regular students and who also play sports, as the depression problems of teenagers are not taken seriously. Many teenagers are afraid to talk to their parents about their current difficult situation, so these anti-depression chatbots would help such students a lot. Anti-depression chatbots could be installed as built-in apps on all mobile phones. As advancements happen in the field of Artificial Intelligence,


these anti-depression chatbots would become more efficient. Anti-depression chatbots can also be used by professional sports players. There is a lot of pressure on sports players, particularly when they fail; they need to find some way, some path, to the top again, and these chatbots can help a lot.

As we keep bettering the underlying technology through trial and error, NLP will grow more efficient, capable of handling more complex commands and delivering more poignant outputs. Chatbots will also be able to have multilinguistic conversations, not only understanding hybrid languages like 'Hinglish' (Hindi crossed with English) with NLU but, with advanced NLG, also being able to reciprocate in kind. In a conversational space, users enjoy the freedom to input their thoughts seamlessly. Be it an enquiry related to a service being provided or a query for help, users receive an instant reply, which gives them a sense of direction inside the app. This app is the best line of defense against a varying range of depression and for a wide range of ages. The app can detect, measure and cure depression, and it will help a huge population cope with the increasing stress that is gripping society. Thus, the contribution of this app towards society is immense.

REFERENCES
[1] Culjak, M. Spranca. Internet Improves Health Outcomes in Depression. Proceedings of the 39th Annual Hawaii International Conference on System Sciences, 2006, pp. 1–9.
[2] Imen Tayari Meftah, Nhan Le Thanh, Chokri Ben Amar. Detecting Depression Using Multimodal Approach of Emotion Recognition. GLOBAL HEALTH 2012: The First International Conference on Global Health Challenges.
[3] Shamla Mantri, Pankaj Agrawal, S. S. Dorle, Dipti Patil, V. M. Wadhai. Clinical Depression Analysis Using Speech Features. 2013 Sixth International Conference on Emerging Trends in Engineering and Technology.

[4] Brian S. Helfer, Thomas F. Quatieri, James R. Williamson, Daryush D. Mehta, Rachelle Horwitz, Bea Yu. Classification of Depression State Based on Articulatory Precision. Interspeech 2013.
[5] Lambodar Jena, Narendra K. Kamila. A Model for Prediction of Human Depression Using Apriori Algorithm. 2014 International Conference on Information Technology.
[6] Thin Nguyen, Dinh Phung, Bo Dao, Svetha Venkatesh, Michael Berk. Affective and Content Analysis of Online Depression Communities. IEEE Transactions on Affective Computing, vol. 5, no. 3, July–Sept. 2014.
[7] Zhenyu Liu, Bin Hu, Lihua Yan, Tianyang Wang, Fei Liu, Xiaoyu Li, Huanyu Kang. Detection of Depression in Speech. 2015 International Conference on Affective Computing and Intelligent Interaction (ACII).
[8] Tan Tze Ern Shannon, Dai Jingwen Annie, See Swee Lan. Speech Analysis and Depression. 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA).
[9] Dongkeon Lee, Kyo-Joong Oh, Ho-Jin Choi. The Chatbot Feels You – A Counseling Service Using Emotional Response Generation. 2017 IEEE International Conference on Big Data and Smart Computing (BigComp).
[10] Arkaprabha Sau, Ishita Bhakta. Predicting Anxiety and Depression in Elderly Patients Using Machine Learning Technology. Vol. 4, no. 6, 2017.
[11] Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, Christopher Potts. Recursive Deep Models for Semantic Compositionality over a Sentiment Treebank. 2013, Stanford University.


EMOTION ANALYSIS ON SOCIAL MEDIA PLATFORM USING MACHINE LEARNING

Shreyas Bakshetti1, Pratik Gugale2, Jayesh Birari3, Sohail Shaikh4 1,2,3,4

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering [email protected], [email protected], [email protected], [email protected]

ABSTRACT
Social media has become a source of various kinds of information nowadays. Analyzing this huge volume of user-generated data on social media can provide useful information for understanding people's emotions as well as the general attitude and mood of the public. Sentiment analysis, also known as opinion mining, is a part of data mining that deals with classifying text expressed in natural language into different classes of emotions. In this paper, we present a framework for sentiment analysis of Twitter data using machine learning.

Keywords
Sentiment Analysis, Machine learning, Ensemble approach

1. INTRODUCTION
Over the last few years, the rise of social media has completely changed the way of communication, providing new means that connect people all over the globe in real time with information, news and events. Social media have completely changed the role of users, transforming them from simple passive information seekers and consumers into active producers. With the wide-spread usage of social media, people have become more and more habitual in expressing their opinions on the web regarding almost all aspects of everyday life. Every day, a vast amount of heterogeneous big social data is generated in various social media and networks. This vast amount of textual data necessitates automated methods to analyze it and extract knowledge from it.

A big contributor to this large amount of social data is the widely used social media platform Twitter. It is a platform where users interact using messages called "tweets". These tweets can be simple textual sentences, or they can be a combination of text, symbols, emoticons and images as well. A lot of the time, these tweets are used by people to express their views on various topics, interact with other users, and understand their views. Sentiment analysis presents an opportunity for organizations with political, social and economic interests to understand the mood of people on various topics.

In this work, we present a framework for understanding and then representing the public attitude/mood expressed by users on the social media platform Twitter. The data required for this purpose will be extracted using the application programming interface (API) provided by Twitter. The extracted data will undergo some pre-processing techniques that help select only the parts of the text that actually express emotions. This will be followed by feature selection, which will be used to build the classifiers. Finally, the classifiers will be used to label the data into polarities, that is, positive or negative.


2. RELATED WORK
Emotion detection methods can be divided into two main categories: lexicon-based methods and machine learning methods. Lexicon-based methods use lexical resources to recognize sentiments in text. This approach is basically a keyword-based approach, where every word in the text is compared to dictionaries of words expressing emotions or sentiments. In paper [1], a lexicon-based approach is used to analyze basic emotions in Bengali text. A model was presented to extract emotion from Bengali text at the sentence level. In order to detect emotion from Bengali text, the study considered two basic emotions, 'happiness' and 'sadness'. The proposed model detected emotions on the basis of the sentiment associated with each sentence.

The other method is using various machine learning algorithms to build classifiers that help in the process of sentiment analysis. Machine learning contains two different types of techniques, supervised and unsupervised, which can both be used for sentiment analysis, although mostly the supervised techniques are used. In the work presented in paper [2], linear regression, a supervised machine learning technique, has been used for sentiment analysis. In another work, in paper [3], classification algorithms such as Naive Bayes Multinomial (NBM), Sequential Minimal Optimization (SMO), Complement Naive Bayes (CNB) and Composite Hypercubes on Iterated Random Projections (CHIRP) were used for classification; Naive Bayes Multinomial (a variation of Naive Bayes) gave the highest accuracy. The author in paper [4] explored machine learning approaches with different feature selection schemes to identify the best possible approach, and found that classification using high-information features resulted in more accuracy than

Bigram Collocation. They also proposed that there was scope for improvement using hybrid techniques with various classification algorithms. The paper [5] proposed a system applying Naive Bayes (NB) and Maximum Entropy (ME) methods to the same dataset, which worked very well, with a high level of accuracy and precision. The work in [6] presented a survey on different classification algorithms (NB, KNN, SVM, DT, and Regression); the authors found that almost all classification techniques were suited to the characteristics of text data. The use of neural networks has started to increase in sentiment analysis in recent times. The authors of paper [7] compared the performance of CNNs (Convolutional Neural Networks) with a combination of CNN and SVM (a supervised technique) and found that the performance of the combination is much higher than that of the CNN alone.

In this paper we are going to use the machine learning approach, as it performs better than the lexicon-based approach. This paper also seeks to improve on previous works by using the ensemble technique for building the classifiers, which should show a great improvement in performance for sentiment analysis.

3. PROPOSED SYSTEM
The aim of our system is to develop a framework to display the emotions of the public regarding any particular topic. To do this, we will be building an application that can be given an input (the topic regarding which the emotions of the public are to be anticipated); after applying pre-processing, feature extraction and classification, the application will display the mood of the public regarding the given topic using graphs and statistics. The social media platform we are using in our work is Twitter; using the application programming interface provided by Twitter, we are able to extract as many tweets as possible. Once we extract the tweets, we will apply a number


of steps to finally classify the tweets into two labels, positive and negative, thus expressing the mood of the public regarding the given topic. The most interesting part of our work is the use of the ensemble approach in the classification process. But before applying the machine learning algorithms, it is important that proper pre-processing of the data is done. Pre-processing is followed by feature extraction, which generates the feature vectors used for classification. Using graphs and statistics, we will also provide a comparison between the results obtained using the techniques individually and the results obtained using the ensemble approach.

Pre-processing
Tweets are usually composed of incomplete expressions, expressions containing emoticons, acronyms or special symbols. Such irregular Twitter data will affect the performance of sentiment classification, so prior to feature selection a series of pre-processing steps is performed on the tweets to reduce the noise and irregularities:
- Removal of all non-ASCII and non-English characters in the tweets.
- Removal of URL links. The URLs do not contain any useful information for our analysis, so they are deleted from tweets.
- Removal of numbers. Numbers generally do not convey any sentiment, are thus useless during sentiment analysis, and are deleted from tweets.
- Expansion of acronyms and slang to their full word forms. Acronyms and slang are common in tweets but are ill-formed words; it is essential to expand them to their original complete word forms for sentiment analysis.
- Replacement of emoticons and emojis. An emoticon expresses the mood of the writer. We replace emoticons and emojis with their original text form by looking up an emoticon dictionary.

NLP and Feature Selection
Natural language processing here basically includes the removal of stop words and the stemming of words after pre-processing.
- Stop-word removal: Stop words usually refer to the most common words in a language, such as "the", "an", and "than". The classic method is based on removing the stop words found in precompiled lists; multiple stop-word lists exist in the literature.
- Stemming: This refers to replacing multiple words with the same meaning by a single base form. Example: "played", "playing" and "play" are all replaced with "play".
The algorithms used for these purposes are described in the further sections of the paper. Finally, feature selection is done: vectors of words are created after pre-processing and NLP have been applied to the tweets. These vectors are given to the classifiers for the purpose of classification.

Ensemble Approach for Classification
In our work we are going to use the ensemble approach for classification, that is, labelling the tweets into different polarities. This is the most important part of our work, as most previous works have used only single machine learning algorithms for classification; here we use an ensemble of three different algorithms to obtain better prediction results than could be obtained from any of the learning algorithms alone. The advantage of the ensemble approach is that it significantly increases the efficiency of classification. Another important aspect of the ensemble approach is the use of the right combination of algorithms. In our work we consider Naive Bayes, Random Forest and Support Vector Machine for the ensemble classifier. These algorithms have been selected as they have proven to give the best results when used individually, and thus using them in the ensemble will also yield efficient results. The algorithms are discussed briefly in a further section.

4. SYSTEM ARCHITECTURE
The following figure shows the proposed architecture of the system, which includes three main parts: pre-processing, feature selection, and applying the ensemble classifier to perform sentiment analysis on social media big data, with visualization of the obtained results using graphs.
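The pre-processing steps listed in the previous section can be sketched as below. This is a minimal sketch: the tiny emoticon and slang dictionaries and the function name are illustrative stand-ins, not the lexicons used by the authors.

```python
import re

# Illustrative stand-in dictionaries; a real system would use full
# emoticon and slang lexicons.
EMOTICONS = {":)": "happy", ":(": "sad"}
SLANG = {"gr8": "great", "idk": "i do not know"}


def preprocess_tweet(tweet: str) -> str:
    # Replace emoticons with their textual form before stripping symbols.
    for emo, word in EMOTICONS.items():
        tweet = tweet.replace(emo, " " + word + " ")
    # Remove URL links: they carry no sentiment information.
    tweet = re.sub(r"https?://\S+|www\.\S+", " ", tweet)
    # Remove non-ASCII characters.
    tweet = tweet.encode("ascii", errors="ignore").decode()
    tweet = tweet.lower()
    # Expand acronyms and slang to their full word forms, token by token.
    tweet = " ".join(SLANG.get(tok, tok) for tok in tweet.split())
    # Remove numbers and remaining symbols, then collapse whitespace.
    tweet = re.sub(r"[^a-z\s]", " ", tweet)
    return " ".join(tweet.split())
```

For example, `preprocess_tweet("Gr8 day :) http://t.co/abc 100")` yields `"great day happy"`: the slang is expanded, the emoticon is textualized, and the URL and number are dropped.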

Fig 1: System Architecture
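The ensemble classification stage described above can be sketched with scikit-learn's VotingClassifier over the three algorithms named in the paper. The tiny labelled set and TF-IDF features are placeholders for the real tweet corpus and feature vectors; they are assumptions of this sketch, not details from the paper.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Placeholder labelled tweets; the real system would use a large corpus.
tweets = ["i love this", "great day", "this is terrible", "i hate it",
          "what a wonderful game", "worst service ever"]
labels = [1, 1, 0, 0, 1, 0]  # 1 = positive, 0 = negative

# Ensemble of Naive Bayes, Random Forest and SVM, combined by
# majority ("hard") voting over TF-IDF word vectors.
ensemble = make_pipeline(
    TfidfVectorizer(),
    VotingClassifier(
        estimators=[
            ("nb", MultinomialNB()),
            ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
            ("svm", SVC(kernel="linear", random_state=0)),
        ],
        voting="hard",
    ),
)
ensemble.fit(tweets, labels)
print(ensemble.predict(["great wonderful day"]))
```

Hard voting takes the majority label of the three base classifiers, which is the simplest way to realize the "better than any single learner" aim stated above.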

5. ALGORITHMS
Algorithms are used in the pre-processing phase as well as in the classification phase. In pre-processing, the algorithms used are for stemming and stop-word removal. They are described below.

NLP Algorithms

ISSN:0975-887

Stop-Word Removal Algorithm
Input: Document D of comments from the review file.
Output: Comment data with stop words removed.
Step 1: The text of the input document D is tokenized and every word from D is stored in an array.
Step 2: A single stop word is read from the list of stop words.
Step 3: The stop word read from the stop-word list is compared to the words from D using sequential search.


Step 4: If a word matches, it is removed from the array, and the comparison continues until all the words from D have been compared.
Step 5: After successful removal of the first stop word, another stop word is read from the stop-word list and the procedure continues from Step 3. The algorithm runs until all the stop words have been compared.

Stemming Algorithm
Input: Comments after stop-word removal.
Output: Stemmed comment data.
Step 1: A single comment is read from the output of the stop-word removal algorithm.
Step 2: The comment is written into another file at the given location and read during the stemming process.
Step 3: Tokenization is applied to the selected comment.
Step 4: Each word is processed in a loop and checked for being null; it is then converted to lower case and compared with the other words in the comments.
Step 5: Words with similar form or meaning are stemmed, that is, reduced to their base word.

After pre-processing, the next step is building the classifier based on the ensemble approach. The following algorithms are considered for that purpose:

Machine Learning Algorithms
This section briefly describes the machine learning algorithms used in our work.

Naïve Bayes Algorithm
Naive Bayes is a simple technique for constructing classifiers: models that assign class labels to problem instances, represented as vectors of feature values, where the class labels are drawn from some finite set. It is a probabilistic classification technique which finds the probability of an instance belonging to a certain class (in our case the classes are positive and negative). The algorithm uses Bayes' theorem, which describes the probability of an event based on prior knowledge of conditions related to it, and assumes that the value of any particular feature is independent of the value of any other feature. It is given as:
P(A|B) = (P(A) * P(B|A)) / P(B)

Support Vector Machine
A support vector machine is a supervised machine learning technique. Every data item is represented as a point in an n-dimensional space, and a hyperplane is constructed that separates the data points into different classes; this hyperplane is then used for classification. In our work the hyperplane divides the dataset into the two classes positive and negative. The hyperplane with the maximum distance to the nearest training data item of both classes is considered the most appropriate one; this distance is called the margin. In general, the larger the margin, the smaller the classification error.

Random Forest
Random Forest is an ensemble of many decision trees. In the classification procedure, each decision tree in the random forest classifies an instance, and the random forest classifier assigns it to the class with the most votes from the individual decision trees. Each decision tree thus performs classification on a random part of the dataset, and the predictions of all these different trees are aggregated to generate the final result.

6. PERFORMANCE MEASUREMENTS
The classification performance will be evaluated in terms of accuracy, recall and precision, computed from a confusion matrix and defined as:
Accuracy = (True positive reviews + True negative reviews) / Total number of documents
Recall = True positive reviews / (True positive reviews + False negative reviews)
Precision = True positive reviews / (True positive reviews + False positive reviews)

7. CONCLUSION AND FUTURE WORK
A framework is being built that will enhance existing techniques of sentiment analysis: previous techniques mostly focused on classifying single sentences, whereas the framework we are building works on huge amounts of data using machine learning techniques. The use of machine learning instead of a lexicon-based approach is a big plus point of this work, and the framework has the potential to outdo existing systems because of the ensemble approach. It performs classification on the basis of polarity, i.e. positive and negative. Future work can include developing better techniques for visualizing the results, classifying the tweets on a range of emotions, and using larger datasets to train the classifiers so as to improve the efficiency of the analysis process.

REFERENCES
[1] Tapasy Rabeya and Sanjida Ferdous, "A Survey on Emotion Detection", 20th International Conference of Computer and Information Technology (ICCIT), 2017.
[2] Sonia Xylina Mashal and Kavita Asnani, "Emotion Intensity Detection for Social Media Data", International Conference on Computing Methodologies and Communication (ICCMC), 2017.
[3] Kudakwashe Zvarevashe and Oludayo O. Olugbara, "A Framework for Sentiment Analysis with Opinion Mining of Hotel Reviews", Conference on Information Communications Technology and Society (ICTAS), 2018.
[4] M. Trupthi et al., "Improved Feature Extraction and Classification - Sentiment Analysis", International Conference on Advances in Human Machine Interaction (HMI-2016), March 03-05, 2016, R. L. Jalappa Institute of Technology, Doddaballapur, Bangalore, India.
[5] Orestes Appel et al., "A Hybrid Approach to Sentiment Analysis", IEEE, 2016.
[6] S. Brindha et al., "A Survey on Classification Techniques for Text Mining", 3rd International Conference on Advanced Computing and Communication Systems (ICACCS-2016), Jan. 22-23, 2016, Coimbatore, India.
[7] Yuling Chen and Zhi Zhang, "Research on text sentiment analysis based on CNNs and SVM", Conference on Information Communications Technology and Society (ICTAS), 2018.
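The majority voting used by the ensemble classifier and the three measures defined in Section 6 can be sketched in pure Python; the per-classifier outputs below are hypothetical stand-ins for the Naïve Bayes, Random Forest and SVM predictions, not results from the actual system:

```python
# Sketch of majority voting over three classifiers and the three
# evaluation measures (accuracy, recall, precision) from Section 6.
from collections import Counter

def majority_vote(predictions):
    """predictions: list of per-classifier label lists, one label per tweet."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

def evaluate(predicted, actual, positive="pos"):
    tp = sum(p == positive and a == positive for p, a in zip(predicted, actual))
    tn = sum(p != positive and a != positive for p, a in zip(predicted, actual))
    fp = sum(p == positive and a != positive for p, a in zip(predicted, actual))
    fn = sum(p != positive and a == positive for p, a in zip(predicted, actual))
    accuracy = (tp + tn) / len(actual)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return accuracy, recall, precision

nb  = ["pos", "neg", "pos", "neg"]   # Naive Bayes output (hypothetical)
rf  = ["pos", "pos", "pos", "neg"]   # Random Forest output (hypothetical)
svm = ["neg", "neg", "pos", "neg"]   # SVM output (hypothetical)
final = majority_vote([nb, rf, svm])
print(final, evaluate(final, ["pos", "neg", "pos", "neg"]))
```

With an odd number of voters and two classes there is always a strict majority, which is why three classifiers is a convenient ensemble size.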


STOCK MARKET PREDICTION USING MACHINE LEARNING TECHNIQUES Rushikesh M. Khamkar, Rushikesh P. Kadam, Moushmi R. Jain, Ashwin Gadupudi Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune [email protected], [email protected], [email protected], [email protected]

ABSTRACT
The stock market prediction task is interesting and divides researchers and academics into two groups: those who believe that we can devise mechanisms to predict the market, and those who believe that the market is efficient, so that whenever new information comes up the market absorbs it by correcting itself, leaving no room for prediction. In this work we use different techniques: the Support Vector Machine (SVM), Single-Level Perceptron, Multi-Level Perceptron and Radial Basis Function.
General Terms: Support vector machine, radial basis function, multi-level perceptron, single-level perceptron, machine learning.
Keywords: Stock market, stock prediction, machine learning, classification of stocks.

1. INTRODUCTION
For a new investor, the share market can feel a lot like legalized gambling: randomly choose a share based on gut instinct; if the value of your share goes up you are in profit, else you are in loss. The share market can be intimidating, but the more you learn about shares, and the more you understand the true nature of stock market investment, the better and smarter you will manage your money.
Terms:
• A stock of a company constitutes the equity stake of all shareholders.
• A share of stock is literally a share in the ownership of a company[1]. When an investor purchases a share of stock, the investor is entitled to a small fraction of the assets and profits of that company.
• Assets include everything the company owns (real estate, equipment, inventory).
• Shares in publicly traded companies are bought and sold at a stock market or a stock exchange.
These are some examples of popular stock exchanges:
• NYSE - New York Stock Exchange
• NASDAQ - National Association of Securities Dealers
• NSE - National Stock Exchange (India)
• BSE - Bombay Stock Exchange
There is no way to predict the accurate trends in the stock market. Many factors affect the share prices of different companies[1]. The best way to understand stock markets is to analyze and study how the market has moved in the past[2].
Share market trends tend to repeat themselves over time. After you study the cycle of a particular stock, you can make predictions about how it will change over the course of time[3]. Some stocks might be truly arbitrary, in which case the movement is random, but in most cases there is a particular trend that repeats itself. Recognizing these patterns will enable you to predict the future trend[1].

The project goal is to build a system where the machine learning algorithms try to


predict the prices of stocks based on their previous closing prices and other attributes that influence the price, like interest rates, foreign exchange and commodity prices[4].

2. MOTIVATION
Stock market movements make headlines every day. In India, 3.23 crore individual investors trade stocks; Maharashtra alone accounts for one-fifth of these investors. However, a report from Trade Brains shows that 90% of these investors lose money due to various reasons like insufficient research, speculation, trading with emotions, etc. A higher inflation rate and lower interest rates make it ineffective to put one's money into a savings account or fixed deposits[5][6]. Thus, many people look to the stock market to keep up with inflation. In this process of multiplying their money, many investors have made a fortune, while some have lost a lot of money due to unawareness or lack of time to research a stock. There are lots of contradicting opinions in the news, and an individual may not have the time or may not know how to research a stock. Most importantly, it is very difficult to manually predict stock prices based on the previous performance of a stock. Due to these factors many investors lose a lot of money every year[6]. A system that could predict stock prices accurately is highly in demand. Individuals can know the predicted stock prices upfront, and this may prevent them from investing in a bad stock. This would also mean a lot of saved time for the many investors who are figuring out whether a particular stock is good or not.

3. LITERATURE SURVEY


1. Comparative analysis of data mining techniques for financial data using parallel processing[1] [2014] [IEEE]: Performs a comparative analysis of several data mining classification techniques on the basis of accuracy, execution time, types of datasets and applications. Simple regression and multivariate analysis are used, with regression analysis on attributes. No use of machine learning; does not provide the algorithm used.
2. Stock market prices do not follow random walks: Evidence from a simple specification test[2] [2015] [IEEE]: Tests the random walk hypothesis for weekly stock market returns by comparing variance estimators derived from data sampled at different frequencies. Simple trading-rule extraction, and extraction of trading rules from charts. No alternative provided to human investing; shows only the flaws of manual investment.
3. A Machine Learning Model for Stock Market Prediction[3] [2017] [IJAERD]: Support Vector Machine with Regression Technology (SVR) and Recurrent Neural Networks (RNN). Regression analysis on attributes using simple regression and multivariate analysis. Not tested in the real market; shows how social media affects share prices but does not account for other factors.
4. Twitter mood predicts the stock market[4] [2010] [IEEE]: Analyzes the text content of daily Twitter feeds with two mood-tracking tools, namely OpinionFinder, which measures positive vs. negative mood, and Google-Profile of Mood States. The results are strongly indicative of a predictive correlation between measurements of public mood states from Twitter feeds. Difficult to scan each


and every text extracted from the large set of data; difficult text mining.
5. Stock Market Prediction on High-Frequency Data Using Generative Adversarial Nets[5] [2017] [Research]: Proposes a generic framework employing Long Short-Term Memory (LSTM) and a convolutional neural network (CNN) for adversarial training to forecast the high-frequency stock market. The model achieves prediction ability superior to other benchmark methods by means of adversarial training, minimizing direction prediction loss and forecast error loss. It cannot predict multi-scale conditions and live data.
6. Stock Market Prediction Using Machine Learning[6] [2016] [IEEE]: Uses different modules, gives different models and obtains the best accuracy using live streaming data. Predicts real market data and processes live data using single- and multi-level perceptrons, SVM and radial basis functions. It could not work with textual data from browsing data (web crawling).
7. Stock Market Prediction by Using Artificial Neural Networks[7] [2014] [IEEE]: This model takes the help of artificial intelligence and uses only neural networks to predict the data, using single- and multi-level perceptrons. It uses 10 hidden layers with a learning rate of 0.4, a momentum constant of 0.75 and a maximum of 1000 epochs. The model does not use machine learning algorithms like SVM and the radial basis function to determine their accuracy.
8. Price trend prediction Using Data Mining Algorithm[8] [2015] [IEEE]: This paper presented a data mining approach to

predict the long-term trend of the stock market. The proposed model detects anomalies in the data according to the volume of a stock to accurately predict its trend. The paper only provides long-term predictions and does not give predictions for immediate trends.

5. PROPOSED WORK
Stock market prediction using machine learning can be a challenging task. Determining which indicators and input data will be used, and gathering enough training data to train the system appropriately, is not obvious. The input data may be raw data on volume, price or daily change, but it may also include derived data such as technical indicators (moving average, trend-line indicators, etc.)[5] or fundamental indicators (intrinsic share value, economic environment, etc.). It is crucial to understand what data can be useful to capture the underlying patterns and integrate it into the machine learning system. The methodology used in this work consists of applying machine learning systems, with special emphasis on Genetic Programming (GP). GP has been considered one of the most successful existing computational intelligence methods, capable of obtaining competitive results on a very large set of real-life applications against other methods. The different algorithms used are described in the Methodology section[1].
Tools and Technologies Used
 Python
 Usage of libraries like OpenCV, scikit, pandas, numpy
 Machine learning techniques - classifiers
 Linear regression techniques
 Jupyter IDE

4. GAP ANALYSIS

1. Comparative analysis of data mining techniques for financial data using parallel processing [2014] [IEEE]
   Objective: Comparative analysis of several data mining classification techniques on the basis of accuracy, execution time, types of datasets and applications.
   Technique: Simple regression and multivariate analysis; regression analysis on attributes.
   Gap: No use of machine learning; does not provide the algorithm used.

2. Stock market prices do not follow random walks: Evidence from a simple specification test [2015] [IEEE]
   Objective: Test the random walk hypothesis for weekly stock market returns by comparing variance estimators derived from data sampled at different frequencies.
   Technique: Simple trading-rule extraction; extraction of trading rules from charts.
   Gap: No alternative provided to human investing; shows only the flaws of manual investment.

3. A Machine Learning Model for Stock Market Prediction [2017] [IJAERD]
   Objective: Support Vector Machine with Regression Technology (SVR), Recurrent Neural Networks (RNN).
   Technique: Regression analysis on attributes using simple regression and multivariate analysis.
   Gap: Not tested in the real market; shows how social media affects share prices; does not account for other factors.

4. Twitter mood predicts the stock market [2010] [IEEE]
   Objective: Analyzed the text content of daily Twitter feeds with two mood-tracking tools, namely OpinionFinder (positive vs. negative mood) and Google-Profile of Mood States.
   Technique: Results strongly indicative of a predictive correlation between measurements of public mood states from Twitter feeds.
   Gap: Difficult to scan every text extracted from a large set of data; difficult text mining.

5. Stock Market Prediction on High-Frequency Data Using Generative Adversarial Nets [2017] [Research]
   Objective: Proposes a generic framework employing Long Short-Term Memory (LSTM) and a convolutional neural network (CNN) for adversarial training to forecast the high-frequency stock market.
   Technique: Achieves prediction ability superior to other benchmark methods by means of adversarial training, minimizing direction prediction loss and forecast error loss.
   Gap: Cannot predict multi-scale conditions and live data.

6. Stock Market Prediction Using Machine Learning
   Objective: Uses different modules, gives different models and obtains the best accuracy using live streaming data.
   Technique: Predicts real market data and processes live data using single- and multi-level perceptrons, SVM and radial basis functions.
   Gap: Could not work with textual data from browsing data (web crawling).


6. METHODOLOGY
In this project we tried to predict stock market prices using four different techniques: the Support Vector Machine and three artificial neural network algorithms.

Support Vector Machine (SVM)
In machine learning, support vector machines are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier. Given a set of training examples[7], each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on[6]. In addition to performing linear classification, SVMs can efficiently perform non-linear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces.
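The linear decision rule described here can be sketched in pure Python; the weight vector and bias below are illustrative stand-ins for parameters that training would produce:

```python
# Sketch of a linear SVM's decision rule: a point is classified by
# which side of the hyperplane w.x - b = 0 it falls on.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def classify(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if dot(w, x) - b >= 0 else -1

w, b = [2.0, -1.0], 0.5             # hypothetical trained parameters
print(classify(w, b, [1.0, 0.0]))   # 2.0 - 0.5 = 1.5  -> +1
print(classify(w, b, [0.0, 1.0]))   # -1.0 - 0.5 = -1.5 -> -1
```

Training (finding w and b that maximize the margin) is the hard part and is what SVM libraries solve; only the resulting decision rule is shown here.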

Figure 3.1: Demonstration of SVM
Linear discriminant function: f(x) = w.x + b
In this function, x refers to a training dataset vector, w is the weight vector and b is the bias term. The term w.x refers to the dot product (inner product, scalar product), which is the sum of the products of the vector components.
Classification hyperplane equations:
Positive margin hyperplane: w.x - b = 1
Negative margin hyperplane: w.x - b = -1
Middle optimum hyperplane: w.x - b = 0

Radial Basis
A radial basis function network is an artificial neural network which uses radial basis functions as activation functions. These networks are feed-forward networks which can be trained using supervised training algorithms. They are used for function approximation in regression, classification and time series prediction[5]. Radial basis function networks are three-layered networks where the input layer units do no processing, the hidden layer units implement a radial activation function, and the output layer units implement a weighted sum of the hidden unit outputs. Nonlinearly separable data can easily be modeled by radial basis function networks. To use a radial basis function network we have to specify the type of radial basis activation function, the number of units in the hidden layer and the algorithms for finding the parameters of the network[3].

Figure 3.2: A demonstration of a radial basis function network
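One hidden unit of such a network can be evaluated directly; a minimal sketch assuming the Gaussian Φ(z) = e^(-z) as the activation and the identity matrix as the metric R, with illustrative centers and weights:

```python
import math

# Sketch of one hidden unit of a radial basis function network:
# h(x) = Φ(||x - c||^2) with the Gaussian Φ(z) = e^(-z) and the
# identity matrix as the metric. Centers and weights are illustrative.

def gaussian_rbf(x, c):
    z = sum((xi - ci) ** 2 for xi, ci in zip(x, c))  # squared distance to center
    return math.exp(-z)

def rbf_network(x, centers, weights):
    """Output layer: weighted sum of the hidden units' activations."""
    return sum(w * gaussian_rbf(x, c) for w, c in zip(weights, centers))

print(gaussian_rbf([0.0, 0.0], [0.0, 0.0]))  # distance 0 -> activation 1.0
```

Training such a network amounts to choosing the centers and fitting the output weights, which is not shown here.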


h(x) = Φ((x - c)^T R^(-1) (x - c))
where Φ is the function used, c is the center and R is the metric. The term (x - c)^T R^(-1) (x - c) is the distance between the input x and the center c in the metric defined by R. Several types of function are commonly used, such as the Gaussian Φ(z) = e^(-z), the multiquadric Φ(z) = (1 + z)^(1/2), the inverse multiquadric Φ(z) = (1 + z)^(-1/2) and the Cauchy Φ(z) = (1 + z)^(-1).
Single-Layer and Multi-Layer Perceptron
A single-layer perceptron (SLP) is a feed-forward network based on a threshold transfer function. The SLP is the simplest type of artificial neural network and can only classify linearly separable cases with a binary target (1, 0)[1]. The single-layer perceptron has no a priori knowledge, so the initial weights are assigned randomly. The SLP sums all the weighted inputs, and if the sum is above the threshold (some predetermined value), the SLP is said to be activated (output = 1). The input values are presented to the perceptron, and if the predicted output is the same as the desired output, the performance is considered satisfactory and no changes to the weights are made. However, if the output does not match the desired output, the weights are changed to reduce the error[8].
A multi-layer perceptron (MLP) has the same structure as a single-layer perceptron, with one or more hidden layers. The backpropagation algorithm consists of two phases: the forward phase, where the activations are propagated from the input to the output layer, and the backward phase, where the error between the observed actual value and the requested nominal value in the output layer is propagated backwards to modify the weights and bias values[5].
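The threshold activation and error-driven weight update described above can be sketched as a single-layer perceptron in pure Python; the learning rate and the AND-gate data are illustrative, not the parameters used in this work:

```python
# Sketch of single-layer perceptron training: output is 1 when the
# weighted sum exceeds the threshold, and weights change only when
# the prediction is wrong. Illustrative data: the AND gate.

def predict(weights, bias, x):
    s = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > 0 else 0

def train(samples, lr=0.1, epochs=20):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error:  # adjust weights only on a misclassification
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
    return weights, bias

data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]  # AND gate
w, b = train(data)
print([predict(w, b, x) for x, _ in data])  # -> [0, 0, 0, 1]
```

Because the AND gate is linearly separable, the perceptron converges; an MLP with backpropagation would be needed for non-separable cases such as XOR.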


Figure 3.3: A demonstration of single-layer and multi-layer perceptrons
Single- and multi-layer perceptrons have multiple inputs and a single output. Let x1, x2, ..., xn be the input vector and w1, w2, ..., wn the weights associated with it[7].
Output: a = x1.w1 + x2.w2 + ... + xn.wn

7. SYSTEM ARCHITECTURE

8. CONCLUSION
In this work, we looked at the problem of forecasting stock performance.


Although a substantial volume of research exists on the topic, very little is aimed at long-term forecasting while making use of machine learning methods and textual data sources. We prepared over ten years' worth of stock data and proposed a solution which combines features from textual yearly and quarterly filings with fundamental factors for long-term stock performance forecasting. Additionally, we developed a new method of extracting features from text for the purpose of performance forecasting and applied feature selection aided by a novel evaluation function.
Problems Overcome[5]
To produce effective models, there were two main problems we had to overcome. The first was that of market efficiency, which places theoretical limits on how patterns can be found in stock markets for the purpose of forecasting. This property can become a concrete problem through patterns being exhibited in the data which are useless or even detrimental for predicting future values. The first way we dealt with this was by carefully splitting our data into training, validation and testing sets with expanding windows, so as to make maximum use of the data while trying to avoid accidental overfitting. The second way was by using a tailored model performance metric, which aimed to ensure good test performance of models by not only maximizing validation performance but also minimizing the variation of this value across validation years[7]. The third way was by performing feature selection with the selection algorithm, so as to remove those features which performed poorly or unreliably. The second set of problems came from putting together a dataset to use for experimentation and testing. Due to the large volume of the data, care had to be taken when cleaning and preparing it, and the inevitable mistakes along the way required reprocessing of the data[4]. Using expert knowledge, we determined how to

deal with the various problems in the data, and ended up using mean substitution and feature deletion.

9. FUTURE WORK
1. Model Updating Frequency: The models are trained once and then used for predicting stock performance over the span of a year. Since we use a return duration of 120 trading days, there is a necessary wait of half a year before data can be used to train models, which means that models end up making predictions using data which is over a year old. One way to make use of data as soon as it becomes available is to completely retrain the model every week (or less). A faster way to improve model performance may be through updating with incremental machine learning algorithms, which can update model parameters without retraining on all data[6].
2. Explore More Algorithms: Although many different models were considered, including various linear regression methods, gradient boosting, random forests and neural networks, there is always more room to explore.
3. Improve Feature Extraction: A few methods for extracting features from filings with textual data were explored. The problems of extracting features from text and determining text sentiment in particular are well studied, and other natural language processing methods may perform better. Our approach of using autoencoders to extract features may also benefit from further exploration. In particular, when using the auxiliary loss, a more accurate method for estimating the financial effect corresponding to a given filing would be useful.
4. Utilize Time Series Information:


Similar to the idea of updating the model frequently, another area for exploration is utilizing the time series aspect of the data. Our current models are not aware that the samples occur in any temporal order, and thus are not able to spot patterns in stock performance that depend on knowing the order of samples. One type of model that is often used to find and exploit these types of patterns is the recurrent neural network[9].
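The expanding-window data splitting mentioned in the conclusion can be sketched in pure Python; the years and the minimum training size are illustrative:

```python
# Sketch of expanding-window splitting: each successive fold trains
# on all data up to a cutoff year and validates on the next year,
# so the training set grows while validation always lies in the future.

def expanding_windows(years, n_train_min=3):
    """Yield (train_years, validation_year) pairs with a growing train set."""
    for i in range(n_train_min, len(years)):
        yield years[:i], years[i]

for train, val in expanding_windows(list(range(2008, 2014))):
    print(train, "->", val)
```

Unlike random cross-validation, this scheme never lets the model see data from after the validation year, which matters for time-ordered financial data.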

REFERENCES
[1] Raut Sushrut Deepak, Shinde Isha Uday and D. Malathi, "Machine Learning Approach in Stock Market Prediction", International Journal of Pure and Applied Mathematics, vol. 115, no. 8, 2017, pp. 71-77.
[2] Tao Xing, Yuan Sun, Qian Wang and Guo Yu, "The Analysis and Prediction of Stock Price", 2013 IEEE International Conference on Granular Computing.
[3] A. W. Lo and A. C. MacKinlay, "Stock market prices do not follow random walks: Evidence from a simple specification test", Review of Financial Studies, vol. 1, no. 1, pp. 41-66, 1988.
[4] Yash Omer and Nitesh Kumar Singh, "Stock Prediction using Machine Learning", International Journal on Future Revolution in Computer Science & Communication Engineering, 2018.
[5] Ritu Sharma, Shiv Kumar and Rohit Maheshwari, "Comparative Analysis of Classification Techniques in Data Mining Using Different Datasets", International Journal of Computer Science and Mobile Computing, 2015.
[6] Osma Hegazy and Omar S. Soliman, "A Machine Learning Model for Stock Market Prediction", International Journal of Computer Science and Telecommunications, vol. 4, issue 12, December 2013.
[7] S. P. Pimpalkar, Jenish Karia, Muskaan Khan, Satyam Anand and Tushar Mukherjee, "Stock Market Prediction using Machine Learning", International Journal of Advance Engineering and Research Development, vol. 4, 2017.
[8] Xingyu Zhou, Zhisong Pan, Guyu Hu, Siqi Tang and Cheng Zhao, "Stock Market Prediction on High-Frequency Data Using Generative Adversarial Nets", Mathematical Problems in Engineering, vol. 2018.
[9] J. Bollen, H. Mao and X. Zeng, "Twitter mood predicts the stock market", Journal of Computational Science, vol. 2, no. 1, pp. 1-8, 2011.
[10] Ibrahim M. Hamed and Ashraf S. Hussein, "An Intelligent Model for Stock Market Prediction", 2011 International Conference on Computer Engineering & Systems.


STOCK RECOMMENDATIONS AND PRICE PREDICTION BY EXPLOITING BUSINESS COMMODITY INFORMATION USING DATA MINING AND MACHINE LEARNING TECHNIQUES Dr. Parikshit N. Mahalle1, P R Chandre2, Mohit Bhalgat3, Aukush Mahajan4, Priyamvada Barve5, Vaidehi Jagtap6 1,2,3,4,5

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Savitribai Phule Pune University. [email protected],[email protected]@gmail.com3, [email protected], [email protected]

ABSTRACT Abstract Market is an untidy place for predicting since there are no significant rules to estimate or predict the price in share market. Many methods like technical analysis, fundamental analysis, and statistical analysis, etc. are all used to attempt to predict the price in the market but none of these methods are proved as a consistently acceptable prediction tool. In this project we attempt to implement an Artificial Intelligence technique to predict commodity market prices. We select a certain group of raw material and parameters with relatively significant impact on the price of a commodity. Although, market can never be predicted, due to its vague domain, this concept aims at applying Artificial Intelligence in predicting the commodity prices and recommending stock modelling. This System aims to assess the accuracy of prediction by 2 stages and assess the precision of recommendation by the last recommendation stage. Although there is considerable movement between spot and futures prices, futures prices tend to exhibit less variability than spot prices. Hence, futures prices tend to act as an anchor for spot prices, and error-correction models that exploit the long-run integrating relationship provide better forecasts of future spot-price developments. Index Terms— Commodity Prices, Forecast, Prediction 1. INTRODUCTION Stock prices are considered to be chaotic and unpredictable, with very few rules therefore predicting or assuming anything of it is a very tricky business. Predicting the future stock prices of financial commodities or forecasting the upcoming stock market trends can enable the investors to garner profit from their trading by taking calculated risks based on reliable trading strategies. This paper focuses on implementing Machine Learning and Artificial Intelligence to predict commodity prices, so as to help the business providers put their investment and efforts in the right direction to gain maximum profit. 
ISSN:0975-887

The stock market is characterized by high risk and high yield; hence investors are concerned with analysis of the stock market and try to forecast its trend. This paper extends that idea toward stock recommendation and price prediction, and assesses the accuracy of commodity price forecasts over the past several years. In view of the difficulty of accurately forecasting future price movements, the system aims to achieve more prominent results than existing approaches. The project will not only help increase the sales of the country's business providers but also help manage those sales and keep them on a path of improvement. It will help

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 172

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

people in this area of work to take smart, calculated and informed decisions, adding to the advancement of the field and the economy of the country. The model will work on data from the past several years and will be able to improve itself using the real-time data that arrives along the way. It aims to achieve a high level of accuracy in its prediction range and to be adaptable to any kind of data given to it. A number of alternative measures of forecast performance, covering statistical as well as directional accuracy, are employed. The stock recommendation system is based on data already known to us; we focus on raw materials and dependency variation, with Artificial Intelligence and Machine Learning as the path for the project.

2. MOTIVATION
Stock recommendation and prediction is a tricky business, and forecasting commodity prices relying exclusively on historical price data is a challenge of its own. Spot prices and futures prices are nonstationary, but they form a cointegrating relation: spot prices tend to move toward futures prices over the long run, so predicting that path has become more useful than ever. Fluctuations in commodity prices affect global economic activity. For many countries, especially developing countries, primary commodities remain an important source of export earnings, and commodity price movements have a major impact on overall performance; commodity-price forecasts are therefore a key input to policy planning and formulation. Sales are a crucial aspect of any developing nation, but managing those sales within the country and estimating their future prospects is also very important. A recommendation and prediction system will let us estimate the area of maximum outcome, ultimately benefiting all business providers and bringing us to a

position where we can invest smartly and knowingly for maximum outcome. Efficient manufacturing requires knowledge of actual real-time consumption, but it is not always possible to analyze real-time data; stock recommendation therefore gives manufacturers an overview of stock consumption, leading to lower production costs and, in turn, benefits for the end consumer.
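The anchoring role of futures prices described above can be illustrated with a minimal error-correction sketch. The series, the futures price and the adjustment coefficient gamma are toy assumptions for illustration, not estimates from this paper.

```python
# Minimal error-correction sketch: the next spot price is the current spot
# plus a partial adjustment toward the futures price, which acts as a
# long-run anchor (illustrative data only).
def ecm_forecast(spot, futures, gamma=0.3):
    """Forecast the next spot price by pulling spot toward futures."""
    return spot + gamma * (futures - spot)

spot_prices = [100.0]           # assumed current spot price
futures_price = 110.0           # assumed futures (long-run anchor) price
for _ in range(5):
    spot_prices.append(ecm_forecast(spot_prices[-1], futures_price))

# Each successive forecast moves monotonically toward the futures price.
print([round(p, 2) for p in spot_prices])
```

With a positive gamma below one, the forecasts converge toward the futures price without overshooting it, which is the qualitative behaviour the motivation section relies on.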

3. STATE OF ART
Stock prices are considered chaotic and unpredictable. Predicting the future prices of financial commodities, or forecasting upcoming market trends, can enable investors to profit by taking calculated risks based on reliable trading strategies. The stock market is characterized by high risk and high yield; hence investors are concerned with analyzing it and forecasting its trend. Various prediction algorithms and models have been proposed in the literature to predict the stock market accurately. In ―Stock Market Prediction Using Statistical Computational Methodologies and Artificial Neural Networks‖ by A. Rao and S. Hule, the focus is on the technical approaches that have been proposed and/or implemented with varying levels of accuracy and success. It surveys two main approaches, the Statistical Computational approach and the Artificial Neural Networks approach, and describes attempts to combine the two in order to achieve higher-accuracy predictions. In another work, ―An Efficient Approach to Forecast Indian Stock Market Price and their Performance Analysis‖ by K.K. Sureshkumar and Dr. N.M. Elango, the paper describes the use of prediction algorithms and functions to predict future share prices and


compares their performance. The analysis shows that the isotonic regression function can predict stock prices more accurately than other existing techniques; the results will be used to analyze stock prices and their prediction in greater depth in future research. In this paper, different neural classifier functions are examined and applied using the Weka tool. Comparing prediction functions by correlation coefficient, the isotonic regression function is found to predict NSE stock prices more accurately than functions such as Gaussian processes, least mean squares, linear regression, pace regression, simple linear regression and SMO regression. The paper ―Forecasting Commodity Prices: Futures Versus Judgment‖ by Chakriya Bowman and Aasim M. Husain assesses the performance of three types of commodity price forecasts: those based on judgment, those relying exclusively on historical price data, and those incorporating prices implied by commodity futures. The analysis indicates that, on statistical- and directional-accuracy measures, futures-based models yield better forecasts than historical-data-based models or judgment, especially at longer horizons. The results suggest that futures prices can provide reasonable guidance about likely developments in spot prices over the longer term, at least in directional terms. Another idea was proposed by Andres M. Ticlavilca, Dillon M. Feuz and Mac McKee in ―Forecasting Agricultural Commodity Prices Using Multivariate Bayesian Machine Learning Regression‖, where multiple predictions are performed for agricultural commodity prices. To obtain multiple time-ahead predictions, the paper applies the Multivariate

Relevance Vector Machine (MVRVM), which is based on a Bayesian machine learning approach to regression. The performance of the MVRVM model is compared with that of another multiple-output model, the Artificial Neural Network (ANN), and bootstrapping is applied to analyze the robustness of both. The MVRVM model outperforms the ANN most of the time. The potential benefit of these predictions lies in assisting producers in making better-informed decisions and managing price risk.

4. GAP ANALYSIS
Stock Market Prediction Using Statistical Computational Methodologies and Artificial Neural Networks (A. Rao, S. Hule, H. Shaikh, E. Nirwan, Prof. P. M. Daflapurkar): The paper provides ANNs that are able to represent complex non-linear behaviours, and the ANN approach eliminates error in parameter estimation. However, statistical methods are parametric models that require a stronger background in statistics.
An Efficient Approach to Forecast Indian Stock Market Price and their Performance Analysis (K.K. Sureshkumar, Dr. N.M. Elango): Isotonic regression is not constrained by any functional form, such as the linearity imposed by linear regression, as long as the function is monotonically increasing; but it does not fit derivatives, so it will not approximate smooth curves like most distribution functions.
Forecasting Commodity Prices: Futures versus Judgment (Chakriya Bowman, Aasim M. Husain): Uses Root Mean Squared Error (RMSE), a measure of the magnitude of the average forecast error, as the effectiveness measure; but RMSE is commodity


specific, and cannot readily be used for comparison across commodities.
Forecasting Agricultural Commodity Prices Using Multivariate Bayesian Machine Learning Regression (Andres M. Ticlavilca, Dillon M. Feuz and Mac McKee): The dependency between input and output targets is learned using the MVRVM to make accurate predictions, with the potential benefit of assisting producers in making better-informed decisions and managing price risk; but the sparsity (low complexity) of the MVRVM cannot be analyzed on a small dataset.
Forecasting Model for Crude Oil Price Using Artificial Neural Networks and Commodity Futures Prices (Siddhivinayak Kulkarni, Imad Haidar): The ANN is selected as a mapping model and viewed as a nonparametric, nonlinear, assumption-free model, meaning it makes no a-priori assumptions about the problem; whereas an econometric model whose assumptions are incorrect can generate misleading results.

5. PROPOSED WORK
This paper proposes an artificial-intelligence system for prediction and recommendation. This is the heart and brain of the entire process: dataset noise elimination, learning and prediction all occur here. The data provided to the system should be relevant and labelled, so that the system can identify the parameters and predict the patterns it has learned. The system must understand the pattern among the data parameters quickly, since this speeds up the calculation process for predicting future values. The artificial intelligence is based on a machine learning technique, the decision tree, so it must select the ideal parameters in order to understand the pattern and predict values.
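As a hedged illustration of how a decision tree "selects the ideal parameters", the depth-1 tree (decision stump) below picks the raw-material feature whose split best explains the commodity price. The toy data and the squared-error criterion are assumptions for illustration, not the paper's actual model.

```python
def best_stump(features, prices):
    """Pick the (feature, threshold) whose split most reduces squared error,
    mimicking how a decision tree selects the most informative parameter."""
    def sse(ys):
        if not ys:
            return 0.0
        m = sum(ys) / len(ys)
        return sum((y - m) ** 2 for y in ys)

    best = None
    for f in range(len(features[0])):
        for row in features:
            t = row[f]
            left = [p for x, p in zip(features, prices) if x[f] <= t]
            right = [p for x, p in zip(features, prices) if x[f] > t]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best[1], best[2]

# Toy data: columns = [raw-material cost, unrelated noise]; price tracks column 0.
X = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
y = [10.0, 20.0, 30.0, 40.0]
feature, threshold = best_stump(X, y)
# The stump identifies the raw-material cost (feature 0) as the price driver.
```

A full decision tree recurses this selection on each side of the split; the point here is only that the splitting criterion itself performs the parameter selection the text describes.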


The system architecture starts with the client using any web browser to access the server and add their data. This data is then observed and used to generate alpha 1 and alpha 2 with respect to current and historical data, which are needed for the subsequent prediction process. The newly acquired data is tested for abnormalities: if any are found, the data is sent for noise removal before being combined with the historical data; if the new data is clean, it goes directly to the combination step. The history is updated after the newly acquired data is merged in, and is then used to train the system. The system keeps training as new data keeps arriving, and after a certain point becomes capable of training itself.

Fig 1: System Architecture
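The abnormality check in the architecture above can be sketched as a simple outlier filter on incoming data. The three-sigma rule and the sample values are illustrative assumptions, not the system's actual test.

```python
def is_noisy(value, history, k=3.0):
    """Flag a new observation lying more than k standard deviations
    from the mean of the historical data."""
    mean = sum(history) / len(history)
    var = sum((x - mean) ** 2 for x in history) / len(history)
    return abs(value - mean) > k * var ** 0.5

history = [100.0, 101.0, 99.5, 100.5, 100.2]   # assumed historical prices
clean, rejected = [], []
for new in [100.8, 250.0, 99.9]:               # 250.0 is an injected spike
    (rejected if is_noisy(new, history) else clean).append(new)

history.extend(clean)   # clean points are merged into the updated history;
                        # rejected points would go to the noise-removal step
```

Accepted observations extend the history exactly as the combination step in Fig. 1 describes, while the spike is diverted rather than silently absorbed.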

5.1 Training Stage
This is the starting phase of the software cycle, where the system learns and understands the patterns of the commodity and then starts to predict its prices. The stage is divided into two parts: in the first the system learns the dependency of the commodity on its underlying factors, and in the second it learns the external factors affecting the commodity's price. In the first part the


system creates a cluster of algorithms. It identifies the dependency of the commodity on its raw materials by studying the factors of the raw materials, and after learning this dependency it chooses an initial base algorithm based on the factor selected for prediction. After the algorithm is selected, the AI constructs a training sequence and finally trains the machine to understand the raw-material dependency of the commodity. In the second part the system creates a cluster of the raw-material data collected in the first part. It then learns the external factors affecting raw-material price fluctuation, for example inflation and import/export factors, and after learning these it chooses an initial probability state based on the commodity's pattern. After the probability state is selected, the AI constructs a training sequence and finally trains the machine.

Fig 2: Training Stage

5.2 Prediction Stage
In the prediction stage, the system generates a pattern based on the historical data. The discovered pattern is added to the existing sequence of patterns, and using this combination the system predicts a value we call alpha. A test then checks the behaviour of the alpha: if it has a normal value, it is added to the existing sequence of values; otherwise it is

considered an anomaly and the model is retrained.

Fig 3: Prediction Stage
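The alpha test just described can be sketched as a simple acceptance band around past alphas. The two-sigma threshold and the toy alpha sequence are illustrative assumptions, not the system's actual test.

```python
def check_alpha(alpha, past_alphas, k=2.0):
    """Accept alpha if it lies within k standard deviations of past alphas."""
    mean = sum(past_alphas) / len(past_alphas)
    var = sum((a - mean) ** 2 for a in past_alphas) / len(past_alphas)
    return "normal" if abs(alpha - mean) <= k * var ** 0.5 else "anomaly"

past = [1.0, 1.2, 0.9, 1.1, 1.0]    # assumed existing sequence of alphas
alpha = 1.05                         # newly predicted value
if check_alpha(alpha, past) == "normal":
    past.append(alpha)               # extend the existing sequence of values
else:
    retrain_needed = True            # anomaly: the model would be retrained
```

The normal branch extends the value sequence and the anomaly branch flags retraining, mirroring the two outcomes in Fig. 3.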

5.3 Recommendation Stage
This is the final stage, where clients can access recommendations for the commodity they want to buy. First the system identifies the business owner's inventory management and chooses a probability state based on their sales, purchases and prices. The AI constructs a sequence from the data it was fed and tries to implement it on the machine. After learning the inventory management, it recommends to the business owner on the basis of the pattern. It tracks the inventory through a proprietary GIBS-based algorithm, which helps it understand the flow of the inventory, and finally the recommendation is tested. The test has two possible outcomes:
Normal: if the test condition is satisfied, the inventory pattern is added to the system so that it can be recommended in future.
Anomaly: the system has failed to correctly identify the inventory pattern, so processing returns to the beginning, i.e. to identification of the inventory management.


Fig 4: Recommendation Stage

6. CONCLUSION AND FUTURE WORK
The purpose of the project is to improve the overall sales of the market and increase the nation's exports. A stock prediction and commodity recommendation system is a step toward smart investments and healthy profit margins. By estimating the area of maximum outcome, the recommendation and prediction system will benefit all business providers and bring us to a position where we can invest smartly and knowingly for maximum outcome. With proper technological support, the project could in future be expanded to a wide range of commodities.

7. ACKNOWLEDGMENT
It is a long list, but the most important people include our guide, Prof. P. R. Chandre, under whose guidance we were able to learn, explore and grow in experience. We are also grateful to Dr. P. N. Mahalle, Head of the Computer Engineering Department, STES Smt. Kashibai Navale College of Engineering, for his indispensable support and suggestions. It would be remiss not to mention a special thank you to our


college, and in turn the Department of Computer Engineering, for presenting this opportunity to us; we are grateful for the exposure given to us. The team members Ankush, Mohit, Priyamvada and Vaidehi must be explicitly thanked for their individual and cooperative contributions to the constant progress of this project. This is, in general, a large thank you and a wide smile to all those who directly or indirectly influenced the course of the project. Last but not least, we thank our respective parents for their unwavering support and help.

REFERENCES
[1] A. Rao, S. Hule, H. Shaikh, E. Nirwan and P. M. Daflapurkar, ―Stock Market Prediction Using Statistical Computational Methodologies and Artificial Neural Networks‖, International Research Journal of Engineering and Technology (IRJET).
[2] K.K. Sureshkumar and N.M. Elango, ―An Efficient Approach to Forecast Indian Stock Market Price and their Performance Analysis‖, International Journal of Computer Applications (0975-8887).
[3] Chakriya Bowman and Aasim M. Husain, ―Forecasting Commodity Prices: Futures versus Judgment‖.
[4] Andres M. Ticlavilca, Dillon M. Feuz and Mac McKee, ―Forecasting Agricultural Commodity Prices Using Multivariate Bayesian Machine Learning Regression‖.
[5] D. Enke and S. Thawornwong, ―The Use of Data Mining and Neural Networks for Forecasting Stock Market Returns‖, Expert Systems with Applications, 29:927-940, 2005.
[6] R.E. Cumby and D.M. Modest, ―Testing for Market Timing Ability: A Framework for Forecast Evaluation‖, Journal of Financial Economics, Vol. 19(1), 1987.
[7] T.C. Mills, The Econometric Modelling of Financial Time Series, Cambridge University Press, Cambridge, United Kingdom, 1999.
[8] S.H. Irwin, M.E. Gerlow and T. Liu, ―The Forecasting Performance of Livestock Futures Prices: A Comparison to USDA Expert Predictions‖, Journal of Futures Markets, Vol. 14(7), 1994.
[9] E. Bopp and S. Sitzer, ―Are Petroleum Futures Prices Good Predictors of Cash Value?‖, The Journal of Futures Markets, 1987.


A MACHINE LEARNING MODEL FOR TOXIC COMMENT CLASSIFICATION

Mihir Pargaonkar1, Akshay Wagh2, Rohan Nikumbh3, Prof. D.T. Bodake4, Shubham Shinde5
1,2,3,4,5 Dept. of Computer Engineering, SKNCOE Pune, India
[email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
With personal content and opinions expanding rapidly on social-media platforms, there is an urgent need to protect their owners from abuse and threats. With the user bases of popular platforms like Reddit, Facebook and Twitter exceeding 500 million and growing, a time-efficient and privacy-protective solution to cyber bullying is an automated one that understands a user's comment and flags it if inappropriate. Social media platforms, online news commenting spaces and many other public forums on the Internet have become increasingly known for abusive behaviour such as cyber bullying, threats and personal attacks. We present our work on the detection and classification of derogatory language in online text, where derogatory language is defined as ―using, containing, or characterized by harshly or coarsely insulting language‖. While derogatory language against any group may exhibit some common characteristics, we have observed that it is typically characterized by a small set of high-frequency stereotypical words, making our task similar to text classification. Automating the identification of abusive comments would not only save websites time but would also increase user safety and improve the quality of online discussions.
Keywords- Natural Language Processing (NLP), Toxic Comment Classification (TCC), Machine Learning (ML).

1. INTRODUCTION
The threat of abuse and harassment online means that many people stop expressing themselves and give up on seeking different opinions. Platforms struggle to facilitate conversation effectively, leading many communities to limit or completely shut down user comments.
As discussions increasingly move to online forums, the problem of trolls and spammers becomes ever more prevalent. Manually moderating comments and discussion forums is tedious, and to deal with the large volume of comments, companies often have to ask employees to take time away from their regular work to sift through comments, or are forced to hire outside moderators. Without careful moderation, social media companies like Reddit and Twitter have

been criticized for enabling cyber bullying. According to a recent survey conducted by Microsoft, 53% of Indian children between the ages of 8 and 17 have been bullied, and India ranks third in cyber bullying, which is a serious concern. Many people are abused and harassed online in ways that may affect them severely and lead to serious situations, so it is necessary to keep online comments and discussions under control by classifying them and acting accordingly. This project identifies toxicity in text, which could be used to deter users from posting potentially hurtful messages, to help them craft more civil arguments when engaging with others, and to gauge the toxicity of other users' comments. The proposed system uses NLP and machine learning techniques to create an


intelligent classifier which can understand the meaning of a sentence and classify it into six categories of toxicity: toxic, severe toxic, obscene, threat, insult and identity hate.

2. LITERATURE SURVEY
NLP and machine learning are used to analyze social comments and identify the aggressive behaviour of an individual or a group. Over the past few years, several techniques have been proposed to measure and detect offensive or abusive content and behaviour on platforms like Instagram, YouTube and Yahoo Answers. Possible features include lexical-syntactic features, TF-IDF (Term Frequency - Inverse Document Frequency), the count of offensive words in a sentence, the count of positive words in a sentence, etc. Existing techniques such as part-of-speech tags, URLs, BoW (Bag of Words) and lexical features are useful for our study in this context. In this study we define two main categories, bullies and non-bullies, and a probabilistic sentiment-analysis approach is used to filter comments into these two categories. Huang et al. chose LSTMs specifically because they solve the vanishing-gradient problem. [1]

Detection techniques for comment classification are based on two kinds of machine learning, supervised and unsupervised. Supervised approaches include various decision tree algorithms, the Naïve Bayes algorithm, regular-pattern-matching algorithms, the K-nearest-neighbour algorithm, and the most popular algorithm, the support vector machine (SVM), which most authors use for classification. Two APIs for toxic comment classification have been developed, by Google and Yahoo. Google's Counter Abuse Technology team developed the Perspective API within Conversation-AI, a collaborative research effort that uses machine learning to improve online discussions; the API produces a toxicity score for an input text. Its limitations are that it can only classify comments in English and that it identifies abusive comments from a predefined dataset: if new comments do not match the stored dataset, their toxicity cannot be determined. Yahoo developed ―Yahoo's anti-abuse AI‖, which can hunt out even the most devious online trolls; it uses the Aho-Corasick string-pattern-matching algorithm to detect abusive comments and correctly detects offensive words with 90% accuracy. Its limitation is the difficulty of building a system that can decide whether any given comment is insulting; with such a system, website owners would have great flexibility in dealing with this problem, but at present no such system is deployed on social media platforms. [2]

Sr. No. | Year | Authors | Synopsis | Limitation
1 | 2009 | Dawei Yin et al. | Supervised learning was used for detecting harassment. The technique employs content features, sentiment features and contextual features of documents, with significant improvements over several baselines, including TF-IDF approaches. | The experiments used only supervised methods; temporal and user information was not fully utilized.
2 | 2012 | Warner & Hirschberg | The authors perform sentiment analysis on blog data using structural correspondence learning, which accommodates issues in blog data such as spelling variations, script differences and pattern switching, comparing English and Urdu. | Mixing two languages imposes constraints: e.g. ―bookon‖ in Urdu appears in English as ―books‖, and their tagger ignores such offensive words.
3 | 2012 | Xiang et al. | A semi-supervised approach was applied to detect offensive content on Twitter using machine learning algorithms. The true positive rate was 75.1% over 4029 test tweets using Logistic Regression, against a baseline of 69.7%, while keeping the false positive rate at about the baseline level of 3.77%. | The focus was on word-level distribution over 860,071 tweets; the approach could not cope with complex features, complex weighting mechanisms, or more data.
4 | 2013 | Dadvar et al. | An improved cyberbullying detection system that classifies users' comments on YouTube using content-based, cyberbullying-specific and user-based features with a support vector machine. | Detection accuracy for offensive comments still needs improvement.
5 | 2015 | Kansara & Shekokar | A framework that detects abusive text messages or images on social network sites using SVM and Naïve Bayes classifiers. | Unable to detect offensive audio and video.
Table 1: Analysis of Related Work
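Several of the surveyed systems use TF-IDF features. A minimal from-scratch computation (a standard formulation, not necessarily the exact one used in the cited works) can be sketched as:

```python
import math

def tf_idf(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n = len(docs)
    df = {}                                  # document frequency per term
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    vectors = []
    for doc in docs:
        tf = {t: doc.count(t) / len(doc) for t in set(doc)}
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

docs = [["you", "are", "awful"], ["you", "are", "kind"]]
vecs = tf_idf(docs)
# Terms shared by every document ("you", "are") get idf = log(2/2) = 0 and
# carry no weight; the discriminative terms ("awful", "kind") get positive
# weight, which is what makes TF-IDF useful as an abuse-detection feature.
```

In practice a library vectorizer with idf smoothing would be used, but the weighting principle is the one shown here.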

Wikipedia talk page data is used to train deep learning models to flag comments. Three models are tested: a recurrent neural network (RNN) with long short-term memory (LSTM) cells and word embeddings, a CNN with character embeddings, and a CNN with word embeddings. Research on comment-abuse classification with machine learning began with the paper of Yin et al., in which the researchers use a support vector machine with TF-IDF features. More recently, research applying deep learning to related fields such as sentiment analysis has proven quite fruitful. Zhang and Lapata used recurrent neural networks, which are known to perform well in sentiment analysis. Wang et al. used LSTMs to predict the polarity of tweets and performed comparably to the state-of-the-art algorithms of the time. Huang, Cao and Dong found that hierarchical LSTMs allow rich context modelling, which enabled much better sentiment classification. Convolutional neural networks have also been used for sentiment analysis: Nogueira dos Santos and Gatti experimented with CNNs using various feature embeddings, from character- to sentence-level; character-level embeddings performed better than the other embeddings on one dataset and equally well on the other. Mehdad and Tetreault added further insight into character-level versus word-level features through their research. It is clear that RNNs, specifically LSTMs, and CNNs are state-of-the-art architectures for sentiment analysis. Given the similarities between comment-abuse classification and sentiment analysis, we use this research to inform our approach and methodology. [3]
Abusive-language detection is inherently formulated as a classification problem, and many works to date make extensive use of deep learning, Naïve Bayes, SVMs and tree-based approaches. In this paper, systems are developed using Gaussian Naïve Bayes, Logistic Regression, K-nearest neighbours, decision trees, multilayer perceptrons and Convolutional Neural Networks (CNNs) in combination with word and character embeddings. A convolutional neural network is used as a multichannel model with five input channels processing 2-6 grams of the input


text. Following the CNN, a fully connected neural network (FCNN) maps the concatenated feature map to a probability distribution over two classes; to handle over-fitting, regularization via dropout is used. [4]
Another paper investigates the effect of various types of linguistic features for training classifiers to detect threats of violence in a corpus of YouTube comments. The data set consists of over 50,000 YouTube comments taken from videos on controversial topics. The experiments reported accuracies from 0.63 to 0.80, but did not report precision or recall. There has also been considerable work on threat detection in a data set of Dutch tweets consisting of a collection of 5000 threatening tweets. That system relies on manually constructed recognition patterns in the form of n-grams, but details of the strategy used to construct these patterns are not given. A manually crafted shallow parser added to the system improves results to a precision of 0.39 and a recall of 0.59. The results show that a combination of lexical features outperforms complex semantic and syntactic features. Warner and Hirschberg (2012) present a method for detecting hate speech in user-generated web text that relies on machine learning combined with template-based features. The task is approached as word-sense disambiguation, since the same words can be used in both hateful and non-hateful contexts. The features used in classification were combinations of uni-, bi- and trigrams, part-of-speech tags and Brown clusters; the best results were obtained using only unigram features. The authors suggest that deeper parsing could reveal significant phrase patterns. [5]


3. GAP ANALYSIS
Current state: Current online discussion platforms are highly susceptible to abusive behaviour and are largely ineffective at detecting, classifying and regulating toxic comments to prevent hurtful discussions. There is a lack of publicly available APIs for effective categorization of toxic comments online.
Ideal future state: Online discussion platforms can effectively evaluate the toxicity of comments being published by their users and take the appropriate action based on the category of toxicity.
Bridging the gap: The proposed system will provide access to a toxic-comment-classification machine learning model through an API which online discussion platforms can use to classify users' comments into six categories: toxic, severe toxic, obscene, threat, insult and identity hate.

4. SYSTEM FEATURES
Functional requirements:
1. The model has an input interface to the user (calling entity) through which comments can be given for classification.
2. The model can predict and classify a comment into the following six categories: toxic, severely toxic, obscene, threat, insult, identity hate.
3. The model has an output interface to the calling entity which reports the categories to which the input comment belongs.
Software requirements (for development): Python 3, NumPy, Pandas, Keras, scikit-learn, Spyder/PyCharm, Jupyter Notebook, Twitter API.
Hardware requirements: Processor: 2.9 GHz (Intel Core i5 recommended); RAM: at least 4 GB.
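The functional requirements above assume that raw comments are cleansed (dates, IP addresses, numbers and stop words removed) before classification. A minimal sketch using standard regular expressions follows; the patterns and the stop-word list are illustrative assumptions, not the system's actual rules.

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to"}   # assumed list

def preprocess(comment):
    """Strip IP addresses, dates, numbers and stop words from a raw comment."""
    text = comment.lower()
    text = re.sub(r"\b\d{1,3}(\.\d{1,3}){3}\b", " ", text)          # IP addresses
    text = re.sub(r"\b\d{1,4}[/-]\d{1,2}[/-]\d{1,4}\b", " ", text)  # dates
    text = re.sub(r"\d+", " ", text)                                # other numbers
    text = re.sub(r"[^a-z\s]", " ", text)                           # punctuation
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)

print(preprocess("Posted on 12/05/2018 from 192.168.0.1: The comment is rude!"))
# -> "posted on from comment rude"
```

The cleaned token string is what would be handed to the feature extractor and classifier; a production system would use a larger stop-word list and more careful patterns.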

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 182

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

Figure 1: System Architecture

5. PROPOSED WORK
The proposed system is a multi-label classification machine learning model which will be able to accurately predict the categories of toxicity to which a comment provided by the client belongs. It is designed to categorize comments into the following six categories of toxicity: toxic, severely toxic, obscene, threat, insult and identity hate. As per the system architecture, the proposed system, named the 'Toxic Comment Classifier', comprises three main components: the Text Preprocessing unit, the Classifier and the Response Generator. The major tasks of these components are:
1. Text Preprocessing unit - The task of this component is to apply common text processing operations on the raw comment obtained from the client. This includes the removal of irrelevant text such as dates, IP addresses, numbers and stop words. This cleansing step is essential since it can hugely affect the accuracy and response time of the classifier.
2. The Classifier - This component is the actual machine learning model, developed using the most suitable algorithm, which evaluates and categorizes the comments sent by the client into the appropriate categories.
3. Response Generator - The task of this component is to capture the results of the classifier and convert them into a suitable format to send as a response to the client via the web API.
The proposed system will be made accessible to its clients in the form of a


Web Application Programming Interface (API). This makes things easy for the clients, as they are all provided with a uniform interface. The clients are expected simply to pass the comments made by users on their platform, in JSON format, to the proposed system's web API for evaluation. The comment then progresses through the three components of the Toxic Comment Classifier, and the response is sent back to the respective client via the web API.
System Parameters
1. Response Time - Since any online discussion platform will typically have several active users posting and updating comments, the process of evaluating comments and generating the corresponding responses must be quick, to ensure that users are not forced to wait for an unsatisfactorily long period of time. The proposed system is therefore expected to provide a response to its clients in less than 4 seconds (assuming good network connectivity).
2. Cost - The cost associated with the proposed system is only for training the machine learning model, which varies from platform to platform depending on factors like GPU specifications, memory size, training time, etc.
3. Scalability - During peak online traffic, it is important to make sure that the proposed system's response does not slow down. As the system is designed in the form of an API, it can easily be scaled up by replicating and deploying it on multiple servers so as to satisfy a larger number of incoming requests efficiently.
4. Accessibility - The proposed system is easily accessible in the form of an API to all its clients through a uniform interface.

6. CONCLUSION AND FUTURE WORK
Tackling the severe issue of abuse and harassment on social media platforms and improving the quality of online discussions, thereby mitigating harmful online experiences, is the need of the hour. The proposed system provides online social media utilities and other such discussion platforms the ability to assess the quality of users' comments by classifying them into various kinds of toxicity using techniques from Natural Language Processing and machine learning. Based on the results provided by the system, the communication platforms can decide the suitable course of action to be taken on such comments and hence ensure that their users have a better, safer and harmless online experience. The goals of future work on toxic comment classification are to make initial admission decisions reliable, decrease the number of false calls and make the QoS guarantees more robust in the face of network dynamics. There are users from various backgrounds and cultures who read and write in their native languages apart from English, so it may be difficult to identify toxic comments in their local languages; this problem can be countered using CNNs or deep learning in the future. The system can also be improved with future advancements in the fields of NLP, ML, AI, speech synthesis, etc.
REFERENCES
[1] Hitesh Kumar Sharma, K. Kshitiz, Shailendra, "NLP and Machine Learning Techniques for Detecting Insulting Comments on Social Networking Platforms", 2018.
[2] Pooja Parekh, Hetal Patel, "Toxic Comment Tools: A Case Study", 2017.
[3] Theodora Chu, Kylie Jue, "Comment Abuse Classification with Deep Learning".
[4] Manikandan R, Sneha Mani, "Toxic Comment Classification: An Empirical Study", 2018.
[5] Aksel Wester, Lilja Ovrelid, Erik Velldal, Hugo Lewi Hammer, "Threat Detection in Online Discussions".
[6] S. Bird, E. Klein, and E. Loper, "Natural Language Processing with Python", 2014. http://www.nltk.org/book/ch02.html
[7] J. Pennington, R. Socher, and C. D. Manning, "GloVe: Global Vectors for Word Representation", 2018. https://nlp.stanford.edu/projects/glove/


[8] Ivan, "LSTM: GloVe + LR decrease + BN + CV", 2018. https://www.kaggle.com/demesgal/lstm-glove-lr-decrease-bn-cv-lb-0-047
[9] A. Srinet and D. Snyder, "Bagging and Boosting". https://www.cs.rit.edu/~rlaz/prec20092/slides/Bagging_and_Boosting.pdf
[10] T. Cooijmans, N. Ballas, C. Laurent, and A. C. Courville, "Recurrent Batch Normalization", CoRR, 2017. https://arxiv.org/pdf/1603.09025.pdf


[11] A. Pentina and C. H. Lampert, "Multi-task learning with labeled and unlabeled tasks", 2017. http://pub.ist.ac.at/~apentina/docs/icml17.pdf
[12] Kaggle, "Toxic Comment Classification Challenge", 2018. https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/leaderboard
[13] Aksel Wester, Lilja Øvrelid, Erik Velldal, Hugo Lewi Hammer, "Threat Detection in Online Discussions", 2016. Department of Informatics, University of Oslo.


HOLOGRAPHIC ARTIFICIAL INTELLIGENCE ASSISTANCE
Patil Girish1, Pathade Omkar2, Dubey Shweta3, Simran Munot4

1,2,3,4
Dept. of Computer Engineering, Shri Chhatrapati Shivaji Maharaj College of Engineering, Ahmednagar, India
[email protected], [email protected], [email protected], [email protected]

ABSTRACT
Current AI assistant systems take user speech as input and process it to give the desired output. However, the currently available systems are Virtual Personal Assistants (VPAs): you can communicate with the assistant, but it is not visible to you. The proposed system will allow you to interact with a 3D holographic assistant, to which input can be provided in the form of speech, gestures, video frames, etc. The assistant can also take the form of any object to give a detailed idea of the required object. The system is intended to increase interaction between humans and machines by projecting a 3D hologram in thin air like a real object, making the holographic effect more realistic and interactive. The system can detect the age of the person from the provided input and adapt its results accordingly, and it can be integrated within smartphones for providing inputs and outputs. It can be used in many application areas, including education assistance, medical assistance, robotics and vehicles, systems for people with disabilities, home automation, and security access control. It can also be used in shops, malls and exhibitions to visualize an object in 3D holographic form instead of the real object.
Keywords
Holographic Artificial Intelligent Assistant; Natural Language Processing; Image Recognition; Gesture Recognition.


1. INTRODUCTION
While using the AI assistants that are currently available, we face a problem: if the microphone of the device fails, we are unable to interact with the assistant at all, which interrupts the interaction. Moreover, current assistants cannot be visualized; they are only virtually present, so we cannot see them. Also, when children are using them, there are concepts that need to be visualized for better understanding. The proposed system is a multi-modal system combined with a holographic view. With advancements in computer graphics and multimedia technologies, the way humans view and interact with the virtual world has changed, for example through augmented reality (AR) and hologram displays. AR display devices, such as smartphones and smart glasses, allow the user to receive additional information in the form of informative graphics based on his or her field of view through the device, for example a street's name or a navigation arrow leading the user to the destination. On the other hand, a holographic pyramid prism can produce holographic results that display 3D objects in the real-world environment, letting the user look at different perspectives of these holograms when viewing from different angles. This system can also be used in the education system to improve the learning experience, creating better understanding in the minds of students. It can likewise be used in malls for demonstration of merchandise: if an item is not yet available but will arrive soon, the customer can still view it using this holographic AI assistant.

2. EXISTING SYSTEM
The current existing systems are as shown below:


Fig 1: Existing Virtual AI Assistance System

As shown in Fig. 1, the current existing systems are virtual AI assistance systems. They do not show the assistant in front of you, and they accept only simple input modes, that is, speech or text. They are not able to take input in the form of video frames, images, gestures, etc., and they are not very interactive.

3. PROPOSED MODEL
The proposed model is an advanced version of the present existing systems. It combines two concepts: holographic projection and an artificial intelligent assistant.

Fig 2: Architecture of Proposed System

Fig. 2 shows the architecture of the proposed system. The system consists of a transparent box with a monitor placed in its top part. Inside the box, a glass prism is set at an angle, which helps in displaying the projection. The projection inside will consist of a simple


human animation. This animation will produce the same effect as a human in certain conditions. As per the authors' calculations, the dimensions would be as shown below in Fig. 3 [3].

Fig 3.2: Gesture as input

Video Frames: In this module the video frames will be given as input and the data will be decoded in it.

Fig 3: Dimensions used

a.

Input Module: The system will be able to take and recognize input in different modes. The modes will be:
Speech: Simple speech is taken as input and decoded, and the result is provided.

Fig 3.3: Video Frames as input

b.

Output Module: The output module will be in the given form:

Gesture: Input can be given in the form of a gesture; the user performs an action, which is recognized, and the proper output is shown.

Fig.4 Output Module With Assistant

Fig. 4 shows the output module. For proper understanding, the displayed assistant


will take the form of the object, as shown in Fig. 5.

Fig.7 NLP

e. Knowledge Base :

Fig.5 Output module with object

c.

Interaction Module: As described by Veton Kepuska [1], this module defines the way the interaction takes place. Fig. 6 shows it.

Fig.6 Interaction Module

d. Natural Language Processing (NLP): This module provides the NLP capability, which is the basic concept behind speech recognition in a multi-modal system. Fig. 7 shows the NLP structure.

The proposed system consists of two knowledge bases, one online and one offline, where all the data and facts are stored: facial and body datasets for the gesture module, the speech recognition knowledge base, image and video datasets, and some user information related to the modules.

4. EXPERIMENTAL RESULTS
While researching the results generated by single-modal AI assistants, we considered efficiency and correctness as the important measures. With increasing functionality, meeting user expectations regarding voice recognition, the visualization experience and fast tracking of hand gestures, which we have introduced in the holographic assistant, is a challenge that needs to be overcome.
Efficiency: In comparison with older AI assistants, the holographic assistant should prove more efficient by using advanced technologies such as Natural Language Processing.
Accuracy: The accuracy of the holographic assistant would also be better, handling challenges like noise and accents, whereas the existing models are more error-prone.
Cost: One advantage of this AI assistant is that it is almost free of cost: the overall prerequisites, apart from available software, are a transparent glass and a monitor screen. Hence, this system would be affordable for all kinds of vendors in the market who will be


ready to take innovation to new levels in their businesses.
REFERENCES
[1] Veton Kepuska, Gamal Bohouta, "Next-Generation of Virtual Personal Assistants (Microsoft Cortana, Apple Siri, Amazon Alexa and Google Home)", 2018 IEEE.
[2] Mrs. Paul Jasmin Rani, Jason Baktha Kumar, Praveen Kumaar B., Praveen Kumaar U. and Santhosh Kumar, "Voice controlled home automation system using natural language processing (NLP) and Internet of Things (IoT)", 2017 Third International Conference on Science Technology Engineering & Management (ICONSTEM).
[3] Chan Vei Siang, Muhammad Ismail Mat Isham, Farhan Mohamed, Yusman Azimi Yusoff, Mohd Khalid Mokhtar, Bazli Tomi, Ali Selamat, "Interactive Holographic Application


using Augmented Reality Edu Card and 3D Holographic Pyramid for Interactive and Immersive Learning", 2017 IEEE Conference on e-Learning, e-Management and e-Services (IC3e).
[4] R. Mead, "Semio: Developing a Cloud-based Platform for Multimodal Conversational AI in Social Robotics", 2017 IEEE International Conference on Consumer Electronics (ICCE).
[5] Chuk Yau and Abdul Sattar, "Developing Expert System with Soft Systems Concept", 1994 IEEE.
[6] Inchul Hwang, Jinhe Jung, Jaedeok Kim, Youngbin Shin and Jeong-Su Seol, "Architecture for Automatic Generation of User Interaction Guides with Intelligent Assistant", 2017 31st International Conference on Advanced Information.


PERSONAL DIGITAL ASSISTANT TO ENHANCE COMMUNICATION SKILLS
Prof. G. Y. Gunjal1, Hritik Sharma2, Rushikesh Vidhate3, Rohit Gaikwad4, Akash Kadam5

1,2,3,4,5
Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India.
[email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
The development of information technology and communication has made the implementation of artificial intelligence systems complex. These systems approach human activities such as decision support systems, robotics, natural language processing, expert systems, etc. In the modern era of technology, chatbots are the next big thing in conversational services. A chatbot is a virtual person who can effectively talk to any human being using interactive textual skills.
GENERAL TERMS
NLP - Natural Language Processing
NLU - Natural Language Understanding
NLG - Natural Language Generation
NLTK - Natural Language Toolkit

1. INTRODUCTION
Chatbots are "online human-computer dialog systems with natural language." The first conceptualization of the chatbot is attributed to Alan Turing, who asked "Can machines think?" in 1950. Since Turing, chatbot technology has improved with advances in natural language processing and machine learning. Likewise, chatbot adoption has also increased, especially with the launch of chatbot platforms by Facebook, Slack, Skype, WeChat, Line, and Telegram. Not only that, but nowadays there are also hybrids of natural language and intelligent systems that can understand human natural language. These systems can learn by themselves and renew their knowledge by reading the electronic articles that exist on the Internet. A human user can ask the system questions as they usually would ask another human.

2. SYSTEM ARCHITECTURE
The system architecture is the conceptual model that defines the structure, behavior, and other views of a system. An architecture description is a formal description and representation of a system, organized in a way that supports reasoning about the structures and behaviors of the system. The system architecture consists of the following blocks.

3. OVERALL DESCRIPTION
Product Perspective
Most of the search engines today, like Google, use a system (the PageRank algorithm) to rank different web pages. When a user enters a query, the query is interpreted as keywords and the system returns a list of the highest-ranked web pages, which may contain the answer to the query. The user must then go through the list of web pages to find the answer they are looking for.
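The PageRank idea mentioned above can be sketched in a few lines of power iteration. The four-page link graph and damping factor below are illustrative assumptions, not Google's actual implementation:

```python
def pagerank(links, d=0.85, iters=50):
    """Power-iteration PageRank over a graph given as {page: [outgoing links]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with a uniform distribution
    for _ in range(iters):
        new = {p: (1 - d) / n for p in pages}
        for p, outs in links.items():
            share = rank[p] / len(outs) if outs else 0.0  # split rank among out-links
            for q in outs:
                new[q] += d * share
        rank = new
    return rank

graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # C collects the most incoming links
```

Pages linked to by many (highly ranked) pages end up with higher rank, which is what pushes them to the top of the result list the user then scans.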


Product Features
The major features of the Drexel Chatbot will be the following:
● Natural Language Processing: The system will take in questions written in standard English.
● Natural Language Responses: The answer to the question will be written in standard, understandable English.
● Information Extraction: There will be a database containing all the information needed, populated using information extraction techniques.
User classes and characteristics
Primary User: The main user class that is going to use this product. The frequency of use could be daily, as every student and every employee needs to improve their communication and personal skills.
Mobile/Web app users: These are the users who want to improve their communication in English. They input sentences to the system and get responses through mobile, web, or text messaging interfaces. This class of users includes students, corporate people, and

anyone who is interested in improving their communication skills.

4. FIGURES
The purpose of a component diagram is to show the relationship between different components in a system. In UML 2.0, the term "component" refers to a module of classes that represent independent systems or subsystems with the ability to interface with the rest of the system. There exists a whole development approach that revolves around components: component-based development (CBD). In this approach, component diagrams allow the planner to identify the different components so that the whole system does what it is supposed to do. Component diagrams are an integral part of designing a system; they are drawn with software which supports UML diagrams. They help in understanding the structure of existing systems and in designing new ones, and they are used to show the relationships among various components.

Fig 2: Component Diagram

5. CONCLUSION
The development of a chatbot application in various programming languages has been carried out, with a user interface to send

input and receive responses. A chatbot is a rising trend, and chatbots increase the effectiveness of business by providing a better experience at low cost. A simple


chatbot is not a challenging task compared to complex chatbots, and developers should understand and consider stability, scalability and flexibility issues, along with a high level of attention to human language. In short, the chatbot field is moving quite fast, and with the passage of time new features are added to existing platforms. Recent advancements in machine learning techniques may be able to handle complex conversation issues, such as payments, correctly.

6. FUTURE SCOPE
The scope of our application in the future includes extending the knowledge database with more advanced datasets and adding support for more languages. Providing users with more detailed reports of their previous performances could improve the pace of users' skill development. We also plan to extend the web application into native mobile apps.

7. ACKNOWLEDGMENTS
We would like to take this opportunity to thank our internal guide Prof. G. Y. Gunjal for giving us all the help and guidance we needed. We are really grateful to him for


his kind support. His valuable suggestions were very helpful. We are also grateful to Dr. P. N. Mahalle, Head of Computer Engineering Department, STES' Smt. Kashibai Navale College of Engineering, for his indispensable guidance, support and suggestions.
REFERENCES
[1] AM Rahman, Abdullah Al Mamun, Alma Islam, "Programming challenges of Chatbot: Current and Future Prospective", Region 10 Humanitarian Technology Conference (2017).
[2] Bayu Setiaji, Ferry Wahyu Wibowo, "Chatbot Using A Knowledge in Database", 7th International Conference on Intelligent Systems, Modelling and Simulation (2016).
[3] Anirudh Khanna, Bishwajeet Pandey, Kushagra Vashishta, Kartik Kalia, Bhale Pradeepkumar, Teerath Das, "A Study of Today's A.I. through Chat bots and Rediscovery of Machine Intelligence", International Journal of u- and e-Service, Science and Technology, Vol. 8, No. 7 (2015).
[4] Sameera A. Abdul-Kader, Dr. John Woods, "Survey on Chatbot Design Techniques in Speech Conversation Systems", International Journal of Advanced Computer Science and Applications, Vol. 6, No. 7 (2015).
[5] https://www.altoros.com/blog/how-tensorflow-can-help-to-perform-natural-language-processing-taksk
[6] https://media.readthedocs.org/pdf/nltk/latest/nltk.pdf


FAKE NEWS DETECTION USING MACHINE LEARNING Kartik Sharma1,Mrudul Agrawal2,Malav Warke3,Saurabh Saxena4 1,2,3,4

Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India.
[email protected], [email protected], [email protected], [email protected]

ABSTRACT
American politics suffered a great setback due to fake news. Fake news is intentionally written to mislead the audience into believing false propaganda, which makes it difficult to detect based on news content alone. Fake news has hindered the mindset of the common people, and due to its widespread presence online it is the need of the hour to check the authenticity of news. The spread of fake news has the potential for an extremely negative impact on society. The proposed approach is to use machine learning to detect fake news, using vectorisation of the news title and then analysing the tokens of words against our dataset. The dataset we are using is a predefined curated list of news items labelled with the property of being fake news or not. Our goal is to develop a model that classifies a given article as either true or fake.
General Terms
Fake News, Self Learning, Pattern Matching, Response Generation, Artificial Intelligence, Natural Language Processing, Context Free Grammar, Term Frequency Inverse Document Frequency, Stochastic Gradient Descent, Word2Vec
Keywords
Natural language processing, Machine learning, Classification algorithms, Fake-news detection, Filtering.

1. INTRODUCTION
This project emphasises providing solutions to the community through a reliable platform to check the authenticity of news. The project Fake News Detection using Machine Learning revolves around discovering the probability of a news item being fake or real. Fake news mainly comprises maliciously fabricated news created in order to gain attention or create chaos in the community.
In the 2016 American election, the propaganda carried on by Russian hackers had a drastic effect on the country. Some supported President Trump while others did not; but still, due to the spread of fake news against both presidential candidates, Trump and Clinton, there was an uproar in the public, and moreover the spread of this fake news on social media had a drastic impact on the lives of Americans. After the election results, this fake news made its prominent way into the market. Fake news also contributed to the exclusion of Britain from the European Union, i.e. Brexit. During the Brexit period the same fake news propaganda was carried on the internet, and because of this a mentality developed among people that one option was better than another, leading to the manipulation of public decisions and undermining the importance of democracy. Thus the very foundation on which countries operate is disturbed; people do not know whom to believe, the belief system of democratic countries is compromised, and people begin to question whether the decisions they took were their own or were caused by the influence of this news. This paper therefore deals with tackling fake news, which has the power to shatter the whole


economy of the world and create a "Great Fall".

2. MOTIVATION
Fake news mostly spreads through the medium of social networking sites such as Facebook, Twitter and several others. Fake news is written and published with the intent to mislead in order to damage a person and/or to gain financially or politically. A litany of verticals, spanning national security, education and social media, are currently scrambling to find better ways to tag and identify fake news with the intention of protecting the public from deception. Our goal is to develop a reliable model that classifies a given news article as either fake or true. Recently Facebook has been at the centre of much critique following media attention. They have already implemented a feature for their users to flag fake news on the site itself, and it is clear from their public announcements that they are actively researching their ability to distinguish these articles in an automated way. Indeed, it is not an easy task. A given algorithm should be politically unbiased, since fake news exists on both ends of the spectrum, and should also give equal balance to legitimate news sources on either end of the spectrum. We need to determine what makes a news site 'legitimate' and a method to determine this in an objective manner.

3. LITERATURE SURVEY
Mykhailo Granik, Volodymyr Mesyura, "Fake News Detection using Naïve Bayes", 2017, proposed an approach for the detection of fake news using a Naïve Bayes classifier, with an accuracy of 74% on the test set.
Sohan Mone, Devyani Choudhary, Ayush Singhania, "Fake News Identification", 2017, proposed a system that calculates the probability of a news item being fake or not by applying NLP and making use of methods like Naïve Bayes, SVM and Logistic Regression.

Shlok Gilda, "Evaluating Machine Learning Algorithms for Fake News Detection", 2017, proposed a system making use of methods like Support Vector Machines, Stochastic Gradient Descent, Gradient Boosting, Bounded Decision Trees, and Random Forests in order to determine the best available way to achieve maximum accuracy.
Sakeena M. Sirajudeen, Nur Fatihah A. Azmi, Adamu I. Abubakar, "Online Fake News Detection Algorithm", 2017, proposed a multi-layered evaluation technique to be built as an app, where all information read online is associated with a tag giving a description of the facts about the content.
Verónica Pérez-Rosas, Bennett Kleinberg, Alexandra Lefevre, Rada Mihalcea, "Automatic Detection of Fake News", 2017, proposed a system that performs comparative analyses of the automatic and manual identification of fake news.

4. GAP ANALYSIS
Table 1. Comparison of existing and proposed system

Sr. no. | Existing System | Proposed System
1 | This system uses tf-idf encoding with statistical machine learning. | This system will use Wikipedia FastText Word2Vec embeddings.
2 | Machine learning concepts such as Self Learning and Pattern Matching are used. | Machine learning concepts such as Self Learning along with Long Short-Term Memory (Recurrent Neural Networks).
3 | This system performed well on news but lacks performance with complex news. | This system outperforms the existing system on complex news.

5. PROPOSED WORK

SYSTEM FEATURE 1 - NEWS GATHERING
We gathered random news on various articles with different subjects to train our model. By studying these, the system detects news intent using a machine learning algorithm. Pre-labelled news items are used to train our models, and the most accurate, best-performing model is selected for our predictions. The pre-labelled data that we collected is from a reliable resource such as Kaggle. The news collected also contains the class attribute with its corresponding values, either true or false, on the basis of which it is determined whether a prediction is a true positive, true negative, false positive or false negative. The class attribute helps in producing the confusion matrix through which metrics like precision, recall, etc. are calculated in order to evaluate the accuracy of the model. The proposed model initially consists of 10,000 different news articles and their corresponding class attributes. Once the news is gathered, the model moves to the next feature.

SYSTEM FEATURE 2 - COMPLEX NEWS HANDLING
Fig 2 - LSTM
Fig 3 - Naïve Bayes

ISSN:0975-887

System will analyse complex news which can be difficult for traditional model. Following steps are required for handling of the complex news, which are as follows Tokenising, padding, encoding, Embedding matrix formation, Model Formation, Model Training and Finally predicting the model. The process starts with the tokenising of the input news which is present in the LIAR dataset. The dataset we are using consists of 10,000 news articles with class attribute of each article. In the next process each article/news is taken and is tokenised, in the tokenisation process all the stop words are removed as well as stemming and lemmatisation is also performed.
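As a rough illustration of this tokenising step, the sketch below lowercases the text, drops stop words, and applies a crude suffix-stripping "stemmer". The stop-word list and stemming rule are simplified assumptions for illustration only; a real pipeline would use a library such as NLTK for proper stemming and lemmatisation.

```python
import re

# Illustrative stand-in for the tokenising step: lowercase, split into
# words, drop stop words, then apply a crude suffix-stripping "stemmer".
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "in", "on", "of", "to"}

def crude_stem(word):
    # Naive stemming: strip a few common suffixes (a real system would
    # use Porter stemming or lemmatisation instead).
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word

def tokenise(article):
    words = re.findall(r"[a-z]+", article.lower())
    return [crude_stem(w) for w in words if w not in STOP_WORDS]

print(tokenise("The senators are debating the proposed laws"))
# → ['senator', 'debat', 'propos', 'law']
```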


The second stage is padding the variable-length token sequences. For this, the pad_sequences() function in the Keras deep learning library can be used to pad variable-length sequences. The default padding value is 0.0, which is suitable for almost every application, although this can be changed by specifying the preferred value via the "value" argument. Whether the padding is applied at the beginning or the end of a sequence, called pre- or post-sequence padding, is controlled by the "padding" argument.
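The padding behaviour described here can be sketched with a minimal pure-Python stand-in for pad_sequences(); it mirrors only the "value" and "padding" arguments discussed above and omits the real Keras function's other options (such as "maxlen" truncation).

```python
# Minimal stand-in for Keras's pad_sequences(): pad every sequence to the
# length of the longest one, either at the start ('pre') or end ('post').
def pad_sequences(sequences, value=0, padding="pre"):
    maxlen = max(len(s) for s in sequences)
    padded = []
    for seq in sequences:
        pad = [value] * (maxlen - len(seq))
        padded.append(pad + seq if padding == "pre" else seq + pad)
    return padded

tokens = [[4, 10, 5], [7, 2], [9]]
print(pad_sequences(tokens))                  # pre-padding with 0 (the default)
print(pad_sequences(tokens, padding="post"))  # post-padding instead
```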

Text data requires special preparation before it can be used for predictive modelling. The text must first be parsed into words, which is called tokenisation. Then the words need to be encoded as integers or floating point values for use as input to a machine learning algorithm, which is called text encoding. Once encoding is completed, the text is ready for the embedding process.

A word embedding is a representation of text in which words with similar meaning have a similar representation. It is an approach to representing words and documents that may be considered one of the key developments of deep learning on challenging natural language processing problems. This transformation is necessary because many machine learning algorithms require their input to be vectors of continuous values; they simply will not work on strings of plain text. Natural language modelling techniques like word embedding are therefore used to map words and phrases from the vocabulary to corresponding vectors of real numbers. The Word2Vec model is used for learning vector representations of words, called "word embeddings". This is typically done as a preprocessing step, after which the learned vectors are fed into a model, usually an RNN, in order to generate predictions. The values are filled in such a way that the vector somehow represents the word and its context, meaning, or semantics.

One way to build such representations is to create a co-occurrence matrix: a matrix that counts how often each word appears next to every other word in the corpus (or training set). From this matrix we can gain useful insights. For example, the words 'love' and 'like' both contain 1 for their counts with nouns like NLP and dogs, and they also have 1 for "I", which indicates that these words must be some sort of verb. Such features are learnt by the neural network, as this is an unsupervised method of learning.

Each vector has a set of characteristics, and numerical analogies hold between them. For example, V(King) - V(Man) + V(Woman) ~ V(Queen), where each word is represented by a 300-dimension vector. V(King) will have characteristics of royalty, kingdom, human, etc. in the vector in a specific order, while V(Man) will have masculinity, human, work in a specific order. When V(King) - V(Man) is computed, the masculinity and human characteristics cancel out; when V(Woman), which has femininity and human characteristics, is added, the result is a vector similar to V(Queen). The interesting thing is that these characteristics are encoded in the vector in a specific order, so that numerical computations such as addition and subtraction work meaningfully. This arises from the nature of the unsupervised learning.
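The co-occurrence counting described above can be sketched over a toy corpus; the three sentences and the adjacent-word-only window below are illustrative assumptions, not the paper's training data.

```python
# Sketch of building a word co-occurrence matrix over a toy corpus: each
# entry counts how often two words appear next to each other.
from collections import defaultdict

corpus = ["i love nlp", "i like dogs", "i love dogs"]
cooc = defaultdict(int)

for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words[:-1]):
        # Count each adjacent (unordered) word pair once per occurrence.
        pair = tuple(sorted((w, words[i + 1])))
        cooc[pair] += 1

print(cooc[("love", "nlp")])   # how often 'love' appears next to 'nlp'
print(cooc[("i", "love")])     # how often 'i' appears next to 'love'
```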

Table 2 - Word Embedding Table
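The V(King) - V(Man) + V(Woman) ~ V(Queen) arithmetic described above can be checked on toy vectors; the three "characteristic" dimensions used here (royalty, masculinity, human) are invented purely for illustration, whereas real Word2Vec embeddings have hundreds of learned dimensions.

```python
import math

# Toy 3-dimensional "embeddings" over invented dimensions
# (royalty, masculinity, human), for illustration only.
V = {
    "king":  [1.0, 1.0, 1.0],
    "man":   [0.0, 1.0, 1.0],
    "woman": [0.0, 0.0, 1.0],
    "queen": [1.0, 0.0, 1.0],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# king - man + woman: masculinity cancels, femininity/human remain.
result = [k - m + w for k, m, w in zip(V["king"], V["man"], V["woman"])]

# The nearest vocabulary vector to the result should be "queen".
nearest = max(V, key=lambda word: cosine(V[word], result))
print(nearest)  # → queen
```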


SYSTEM FEATURE 3 – FAST TRAINING OF NEW DATA ON GPU
The proposed system uses an Nvidia GPU with the CUDA architecture, and thus the training of complex real-time news becomes easier and faster. Keras


automatically uses the GPU wherever and whenever possible with the help of CuDNNLSTM, a high-level Keras/TensorFlow layer that runs the model on an Nvidia GPU using CUDA technology. CUDA is NVIDIA's parallel computing architecture that enables dramatic increases in computing performance by harnessing the power of the GPU (graphics processing unit). CuDNNLSTM is a fast LSTM implementation backed by cuDNN; model training becomes faster by 12 to 15%, depending on the data.

5.1 FIGURES/CAPTIONS
This diagram depicts the actual working of the proposed system and all the functionalities it will perform. Model formation for fake news detection makes use of the training and test data sets and some other parameters, such as the dimensions of the vector space in which the relation between two or more news entities is held. All this data is passed into the main function, which generates the confusion matrix and presents the result as a percentage.

Fig 5 - Working of proposed model

Initially the system stores the gathered news in a database, which is then retrieved by the model; the model processes the training data and produces the classifier. The user is supposed to enter the news, which is thought to be unverified, manually; once the input is given via


the web portal, it reaches the model in the backend, which processes it and gives the output. The news given by the user is taken as a test case and sent to the classifier, which classifies it.

6. CONCLUSION
The circulation of fake news online not only jeopardises the news industry but also negatively impacts users' minds: they tend to believe all the information they read online. Fake news has the power to dictate the fate of a country or even the whole world, and the daily decisions of the public are also affected. Applying the proposed model would definitely help in differentiating between fake and real news.

REFERENCES
[1] Sadia Afroz, Michael Brennan, and Rachel Greenstadt. Detecting hoaxes, frauds, and deception in writing style online. In ISSP'12.
[2] Hunt Allcott and Matthew Gentzkow. Social media and fake news in the 2016 election. Technical report, National Bureau of Economic Research, 2017.
[3] Meital Balmas. When fake news becomes real: Combined exposure to multiple news sources and political attitudes of inefficacy, alienation, and cynicism. Communication Research, 41(3):430-454, 2014.
[4] Alessandro Bessi and Emilio Ferrara. Social bots distort the 2016 US presidential election online discussion. First Monday, 21(11), 2016.
[5] Prakhar Biyani, Kostas Tsioutsiouliklis, and John Blackmer. "8 amazing secrets for getting more clicks": Detecting clickbaits in news streams using article informality. In AAAI'16.
[6] Thomas G Dietterich et al. Ensemble methods in machine learning. Multiple Classifier Systems, 1857:1-15, 2000.
[7] Kaggle Fake News NLP Stuff. https://www.kaggle.com/rksriram312/fakenews-nlp-stuff/notebook.
[8] Kaggle All the News. https://www.kaggle.com/snapcrack/all-the-news.
[9] Mykhailo Granik, Volodymyr Mesyura, "Fake News Detection Using Naïve Bayes", 2017.
[10] Sohan Mone, Devyani Choudhary, Ayush Singhania, "Fake News Identification", 2017.


COST-EFFECTIVE BIG DATA SCIENCE IN MEDICAL AND HEALTH CARE APPLICATIONS
Dr S T Patil1, Prof G S Pise2

1 Department of Computer Engineering, VIT, Pune.
2 Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India.
[email protected], [email protected]

ABSTRACT
Big data can play an important role in data science and the healthcare industry, helping to manage data and utilize it properly with the help of the "V6" characteristics (Velocity, Volume, Variety, Value, Variability, and Veracity). The main goal of this paper is to provide a deep analysis of the field of medical science and healthcare data analysis, with a focus on previous strategies in healthcare as well as medical science. Digitization in medical science (MS) and the healthcare industry (HI) produces massive amounts of patient-related data, which can be analyzed to get a 360-degree view of the patient for analysis and prediction. This helps improve healthcare activities such as clinical practice, new drug development and the financial processes of healthcare, and brings many benefits, such as early disease detection, fraud detection, and better healthcare quality and efficiency. This paper introduces big data analytics techniques and challenges in healthcare and their benefits, applications and opportunities in medical science and healthcare.
General Terms
Hadoop, Map-Reduce, Healthcare Big-Data, Medicals, Pathologist.
Keywords
Healthcare Industry (HI), R, Data Analytics (DA), Smart-Health (SH).

1. INTRODUCTION
The main goal of this paper is to provide the best predictive-analysis solution to researchers, academicians, healthcare industries and medical science industries who have an interest in big data analytics for specific healthcare and medical science applications.
All healthcare industries and medical science researchers depend on data for analysis and processing. That data is generated by government hospitals and private clinics as a collaborative record of every old and new patient, in the form of differently structured data known as big data. Big data can be processed and identified with the help of its characteristics, the "V6" (Volume, Velocity, Variety, Value, Variability, Veracity), to achieve the desired outcomes:
1. Volume: the data size is huge, e.g. terabytes (TB), petabytes (PB), zettabytes (ZB), etc.
2. Velocity: data is generated at high speed, e.g. per day, per hour, per minute, or per second.
3. Variety: data is represented in different types, i.e. structured, unstructured and semi-structured data; for example, data from email messages, articles, streamed videos and audio.
4. Value: the data holds some valuable insight within it; there is useful information somewhere within the data for outcomes.
5. Variability: the data can change during processing, and may yield some unexpected, hidden and valuable information.
6. Veracity: this focuses on two terms, data trustworthiness and data consistency; data may be in doubt, i.e. ambiguous, incomplete or uncertain, due to inconsistency.


The consideration of all healthcare industries and medical science researchers about big data is that it contains some false information and noisy data, but the collaborative data is correlated, so we must apply a correct big-data handling approach to achieve outcomes from that data [1]. Big data today may not be big data tomorrow due to advances in healthcare industries and medical science, but data generation will not stop; day by day data is generated rapidly, and it is difficult to manage because human requirements keep increasing with living standards. Optionally, we can say that if all types of data are combined collaboratively at one location, then it is not difficult to process and manage.

Figure: big-data characteristics and healthcare.

2. HEALTH CARE DATA
Big data in healthcare industries and medical science refers to health data sets that are complex and hence difficult to manage with common data management methods as well as traditional software and hardware [2]. The healthcare industries and medical science researchers mostly rely on pharmacists, hospitals, radiologists, druggists, pathologists, as well as other web-service-based applications related to healthcare and management. It is important for every country to digitalize the healthcare process in a way that supports data analysis and processing for the healthcare industries. In any government hospital or private hospital/clinic, every new patient registration is supposed to be recorded in an Electronic Registration System (ERS), and patients need to be issued a secure chip-based data card so that their record can be updated in various departments, which helps to identify past records, symptoms and other formalities done by previous doctors, with all details [2].

The advancement of pathological processes with the Digital Clinical Observation System (DCOS), radiology and, last but not least, the Robotic Guided Healthcare System (RGHS), etc., can generate records consisting of database dumps, texts, images as well as videos. These data can be brought together collaboratively at a particular location to achieve the expected outcomes of a Context-based Retrieval System (CRS) and an accurate analytics process, which helps to provide cost-effective and fast service to individual patients and healthcare management [3]. The collaborative data leads towards a large volume of data with different structural views; hence in the previous section we introduced the characteristics of big data. The data created may be structured or unstructured; it can be stored, manipulated, recalled, analyzed or queried by electronic machines. Various types of data are used in healthcare, categorized as follows:

1. Genomic Data

Genomic data refers to the genome and DNA data of an organism. It is used in bioinformatics for collecting, storing and processing the genomes of living things. Genomic data generally requires a large amount of storage and purpose-built software to analyze [1].
2. Clinical Data and Clinical Notes
Approximately 80% of this data is unstructured data, with documents, images, and clinical or prescription notes. Structured data is also available, such as laboratory data and structured EMR/EHR [1].
3. Behavioral Data and Patient Sentiment Data
This category generally covers data from search engines, consumer internet use and networking sites such as Facebook, Twitter, LinkedIn, blogs, health plan websites and smartphones, etc.
4. Administrative, Business and External Data
This category covers data from insurance claims and related financial data, billing and


scheduling; biometric data can also be considered, e.g. fingerprints, handwriting and iris scans [1].

3. HEALTHCARE PATIENT RECORD CHALLENGES
In any hospital or private clinic, the big challenge is managing and analysing the big data of any new or existing patient. The electronic record of a patient can be composed of structured and semi-structured data and instrument recordings from health tests, while unstructured data consists of handwritten notes, patient admission and discharge records, prescription records, etc. The data may also be web-based, machine-based or biometric-based, or generated by humans (e.g. Twitter, Facebook, sensors, remote devices, fingerprints, X-rays, scans, EMRs, mails, etc.). These conventional records and digital data are combined in Healthcare Big Data (HBD). The execution of big data workloads is the most challenging task; hence most researchers have suggested installing big data tools on a standalone system, but big data is generally voluminous, and its processing and execution should be carried out on distributed nodes. Hence we need some knowledge of data analysis techniques and better healthcare decision-making, which will help active enhancement; for processing and analysis we have some open source tools for distributed data processing [6]. Big data in healthcare science and industry is changing the healthcare system for patients and doctors: because voluminous data is involved, it enables more efficient and scalable healthcare, so it can be useful for every patient and hospital to handle each and every patient record easily. Since big data is huge and voluminous, its processing and execution are carried out on distributed nodes. For processing and executing any voluminous data from

a distributed system, Big Data Analytics (BDA) tools are mostly recommended; without any doubt, these analysis tools are beneficial and useful for healthcare.

4. BIG-DATA ANALYTIC TOOLS
In the healthcare industry, a big problem is the processing and execution of data; every hospital and clinic struggles to manage the big data of its patients, and its processing and execution is a difficult task. Big Data Analytics tools therefore play an important role in processing it easily, in two different ways: centralized and distributed [1]. BDA tools are naturally complex in nature, with widespread programming and multi-skill applications combined under one roof, so they are not user-friendly, and the complexity of the process comes with the data itself. For such a system, different types of data need to be combined, and then the raw data is transformed for multiple availability points. It is worth considering how big data supports the entire healthcare industry, which has in fact benefited from these initiatives. In this paper we focus on three areas of big data analytics, intended to provide a perspective on broad and popular research areas where the concepts of big data analytics are currently being applied. These areas are: 1. the healthcare industry aspect with BDA, 2. the impact of big data in healthcare, and 3. opportunities and applications of big data in healthcare.

4.1 Healthcare Industry Aspect with BDA
The healthcare industry is not only one of the largest industries, it is also one of the most complex in nature, with many patients constantly demanding better care management. Big data in the healthcare industry, along with industry analytics, has made a mark on healthcare, but one important point to note here is the security concern, and better programming skills are required, as end-user skills are not assumed. The healthcare industry has


some limitations in big data: security, privacy, ownership and standards are not yet established.

4.2 Impact of Big Data in the Healthcare Industry
In the healthcare industry, big data has changed everything with respect to data processing and execution, including in hospitals and clinics. Here we focus on some relevant connections [1].

4.2.1 High Risk Patient Care
Healthcare costs and complications keep increasing, sending many patients into emergency care. Due to the higher cost it is not affordable for poor patients, and many patients do not benefit, so implementing change in this department will be an advantage and hospitals will work properly [1]. If all records are digitized, patient patterns can be identified more effectively and quickly, which directly helps reduce the time for examination and for applying the proper treatment; it also helps in checking on patients at high risk and ensures more effective, customized treatment. Lack of data makes the creation of patient-centric care programs more difficult, so one can clearly understand why big data utilization is important in the healthcare industry. It clearly identifies and processes, with zero error, the execution flow of patient check-ups and the maintenance of patient records with all treatment details; hence big data analytics tools are needed in the healthcare industry [3].

4.2.2 Cost Reduction
Various hospitals, clinics and medical institutions face a high level of financial waste due to improper financial management; this happens, for example, because of overbooking of staff. Through predictive analysis this specific problem can be solved, making it far easier to allocate staff effectively together with admission-rate prediction [7, 8]. Hospital investments will thus be optimized, reducing the investment rate when necessary. The insurance industry

will also gain a financial advantage by backing health trackers and wearables to make sure patients do not over-exceed their hospital stay. Patients could also benefit from this change through lower waiting times, by having immediate access to staff and beds. The analysis will reduce staffing needs and bed shortages [4].

4.2.3 Patient Health Tracking
Identifying potential health problems before they develop into aggravating issues is an important goal for all organizations functioning in the industry. Due to lack of data, the system has not always been able to avoid situations that could easily have been prevented otherwise. Patient health tracking is another strong benefit that comes with big data, as well as with Internet of Things technology [2].

4.2.4 Patient Engagement Could Be Enhanced
Through big data and analytics, an increase in patient engagement could also be obtained. Drawing the interest of consumers towards wearables and various health tracking devices would certainly bring a positive change in the healthcare industry, potentially reaching a noticeable decrease in emergency cases. With more patients understanding the importance of these devices, physicians' jobs will be simplified, and an engagement boost could be obtained through big data initiatives, once again [2, 3].

4.3 Opportunities and Applications of Big Data in the Healthcare and Medical Industry
As mentioned in the first and second sections regarding the role of big data, it can provide major support across all aspects of healthcare. Big data analytics (BDA) has gained traction in genomics, clinical outcomes, fraud detection, personalized patient care and pharmaceutical development; likewise there are many potential applications in


healthcare and medical science; some of these applications are given in Section 4.2 on the impact of big data in the healthcare industry. The following table shows some of the important application areas of big data in the healthcare industry and medical science.

Application Areas | Business Problems | Big Data Types
Healthcare | Fraud detection | Machine generated, transaction data, human generated
Healthcare | Genomic | Electronic health record, personal health record
Healthcare | Behavioral and patient sentiment data | Facebook, Twitter, LinkedIn, blogs, smartphones
Science and Technology | Utilities: predict power consumption | Machine generated data

Table 1: Big Data Applications in Healthcare

5. TECHNOLOGY AND METHODOLOGY PROGRESS IN BIG DATA
Big data plays an important role in every field through big data analytics tools, but here we focus on the healthcare/medical science field. In the medical and healthcare field, a large amount of data is generated about patients' medical histories, symptoms, diagnoses, and responses to treatments and therapies. Data mining is sometimes used here for finding interesting patterns in healthcare data with analytics tools, with the help of the Electronic Patient Record (EPR) of each patient [1].

For healthcare big data, Hadoop with the MapReduce framework is well suited for storing a wide range of healthcare data types, including electronic medical records, genomic data, financial data, claims data, etc. It has higher scalability, reliability and availability than a traditional database management system. Hadoop MapReduce increases system throughput and can process huge amounts of data with proper execution, which is helpful for the healthcare industry and medical science [5]. Big data analytics tools are widely used for complex applications, and in the healthcare industry they manage all types of data under one roof with a distributed architecture. The following architecture gives a basic idea of the different sources of big data, which can be considered raw data: external and internal sources, multiple locations, multiple formats, and applications [5, 6]. Raw data from different sources is transformed in middleware with Extract, Transform, Load (ETL) into a traditional format. With the transformed data, we use big data platforms and tools to process and analyse it, and then run the actual big data analytics applications [2].
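The MapReduce pattern mentioned above can be shown in miniature; the patient records and the diagnosis-count job below are invented for illustration, and a real deployment would run the map and reduce phases on Hadoop across distributed nodes rather than in a single process.

```python
# Miniature MapReduce over toy patient records: map emits (diagnosis, 1)
# pairs, shuffle groups them by key, reduce sums the counts: the same
# pattern Hadoop would execute in parallel across distributed nodes.
from itertools import groupby
from operator import itemgetter

records = [
    {"patient": "P1", "diagnosis": "diabetes"},
    {"patient": "P2", "diagnosis": "asthma"},
    {"patient": "P3", "diagnosis": "diabetes"},
]

def map_phase(record):
    yield (record["diagnosis"], 1)

def reduce_phase(key, values):
    return (key, sum(values))

# Map, then shuffle (sort/group by key), then reduce.
pairs = sorted(kv for r in records for kv in map_phase(r))
counts = dict(reduce_phase(k, [v for _, v in group])
              for k, group in groupby(pairs, key=itemgetter(0)))
print(counts)  # {'asthma': 1, 'diabetes': 2}
```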

In this architecture we have shown all the different areas of application covered by big data analytics tools, but here we are


more focused on healthcare industry and medical science applications.

6. BIG DATA CHALLENGES IN HEALTHCARE
Given the big data characteristics (V6), it is difficult to store such large amounts of data, and also difficult to search, visualize, retrieve and curate them. There are many challenges in healthcare applications; some of the major ones are listed below [4].
1. It is difficult to analyze and aggregate unstructured data from different hospitals and clinics (EMRs, notes, scans, etc.).
2. The data provided by many hospitals and clinics is not accurate with respect to quality factors, so it is sometimes difficult to analyze with BDA.
3. Analyzing genomic data is a computationally difficult task.
4. Data hackers can damage big data.
5. Information security is a big challenge in big data.

7. CONCLUSION AND FUTURE RESEARCH
Big data faces many challenges in healthcare and medical science due to lack of infrastructure and skills, privacy and information security concerns, and the difficulty of data processing and execution in present systems. Hospitals and clinics often do not maintain daily updates and lack diagnostic machinery, relying on manual processes; due to these manual processes it is difficult to handle each and every patient properly in a given time span, and hence sometimes patients do not receive the correct diagnosis and treatment. Many small and medium hospitals still run manual processes with documented prescriptions, so patients must carry all prescriptions to every appointment and keep them safe at home. If all records were kept electronically at the hospital, it would be very easy to find every patient's information quickly; it would also help improve the quality of

treatment and hospital activity, with doctor management and prior appointments for every patient, department-wise. These challenges are mostly considered for future research on the role of Big Data Analytics tools in the healthcare industry and medical science, such as privacy-preserving data mining over sensor data and electronic patient records. In healthcare, such changes are necessary for sentiment analysis of healthcare big data with personalized patient data and behavioral data. From a researcher's point of view, big data is the best solution for the healthcare industry and medical science. In future, data will be generated rapidly, so next-generation healthcare big data will be applied across a vast range of healthcare industry and societal applications. In this paper we have presented many BDA tools as a solution for the healthcare industry, which will help establish efficient and cost-effective quality management using a data cluster manager.

REFERENCES
[1] Lidong Wang and Cheryl Ann Alexander. "Big Data in Medical and Healthcare", Department of Engineering Technology, Mississippi Valley State University, USA, 2015.
[2] A. Widmer, R. Schaer, D. Markonis, and H. Muller, "Gesture interaction for content-based medical image retrieval," in Proceedings of the 4th ACM International Conference on Multimedia Retrieval, pp. 503-506, ACM, April 2014.
[3] Weiss, G., "Welcome to the (almost) digital hospital," IEEE Spectrum, vol. 39, no. 3, pp. 44-49, Mar 2002.
[4] Jun-ping Zhao, "Electronic health in China: from digital hospital to regional collaborative healthcare," in Information Technology and Applications in Biomedicine (ITAB 2008), International Conference on, pp. 26-26, 30-31 May 2008.
[5] Raghupathi, Wullianallur, and Viju Raghupathi. "Big data analytics in healthcare: promise and potential." Health Information Science and Systems 2.1, 2014.
[6] Srinivasan, U.; Arunasalam, B., "Leveraging Big Data Analytics to Reduce Healthcare Costs," IT Professional, vol. 15, no. 6, pp. 21-28, Nov.-Dec. 2013.
[7] Hongsong Chen; Bhargava, B.; Fu Zhongchuan, "Multilabels-Based Scalable


Access Control for Big Data Applications," IEEE Cloud Computing, vol. 1, no. 3, pp. 65-71, Sept. 2014.


[8] A. McAfee, E. Brynjolfsson, T. H. Davenport, D. J. Patil, and D. Barton, "Big data: the management revolution," Harvard Business Review, vol. 90, no. 10, pp. 60-68, 2012.


AI – ASSISTED CHATBOTS FOR E-COMMERCE TO ADDRESS SELECTION OF PRODUCTS FROM MULTIPLE CATEGORIES
Gauri Shankar Jawalkar1, Rachana Rajesh Ambawale2, Supriya Vijay Bankar3, Manasi Arun Kadam4, Dr. Shafi. K. Pathan5, Jyoti Prakash Rajpoot6

1,2,3,4,5 Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India.
6 Department of ECE, HMR Institute of Technology and Management, Delhi
[email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
Artificial Intelligence has been used to develop and advance numerous fields and industries, including finance, healthcare, education, transportation, and more. Machine Learning is a subset of AI techniques that gives machines the ability to learn from data, or while interacting with the world, without being explicitly programmed. E-commerce websites are trending nowadays because online shopping makes customers' lives easier. Similarly, chatter robots, i.e. chatbots, provide better customer service over the Internet. A chatbot is a software program for simulating intelligent conversations with humans using rules or artificial intelligence; users interact with the chatbot via a conversational interface through written or spoken text. With the help of an e-commerce website, sellers can reach a larger audience, and with the help of chatbots, sales can be increased through personal interaction with users. Chatbots welcome a user to a conversation and guide the customer to make a purchase, which reduces the customer's struggle. Chatbots will ask customers all the relevant questions to find the perfect fit, style, and color for them. Chatbots are the future of marketing and customer support: one means of technology that helps humans in a lot of ways, by helping businesses increase sales whilst providing great customer satisfaction.
Keywords: online shopping, e-commerce, chatbot, customers, machine learning, artificial intelligence, NLP

1. INTRODUCTION
With the development of internet technology, network services play an increasingly important role in people's daily lives. People expect to get satisfactory service or goods in a convenient way and in a very short time; hence the electronic commerce system plays a very critical part. On one hand, it is very convenient for people to look at goods online, and it shortens the time needed for shopping. On the other hand, for the enterprise, it shortens intermediate links, reduces geographic restrictions and decreases merchandise inventory pressure; therefore it can greatly reduce business operating costs. E-commerce is fast gaining ground as an accepted and widely used business paradigm. More and more business houses are implementing web sites providing functionality for performing commercial transactions over the web. It is reasonable to say that the process of shopping on the web is becoming commonplace. An online store is a virtual store on the Internet where customers can browse the catalogue and select products of interest. The selected items may be collected in a shopping cart. At checkout time, the items in the shopping cart are presented as an order, at which point more information is needed to complete the transaction. Usually, the customer is asked to fill in or select a billing address, a shipping address, a shipping option, and payment information such as a credit card

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 206

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

number. An e-mail notification is sent to the customer as soon as the order is placed. A chatbot is a software program for simulating intelligent conversations with humans using rules or artificial intelligence. Users interact with the chatbot via a conversational interface through written or spoken text. Chatbots will welcome a user to a conversation and guide the customer to make a purchase, which will reduce the customer's struggle. Chatbots are the future of marketing and customer support. Chatbots are one such means of technology which helps humans in a lot of ways, by helping them increase sales whilst providing great customer satisfaction. With the help of an e-commerce website, sellers can reach a larger audience, and with the help of chatbots, sales can be increased by personal interaction with the users. Digitization, the rise of the internet, and mobile devices have changed the way people interact with each other and with companies. The internet has boosted electronic commerce (e-commerce), and the growth of wireless networks and mobile devices has led to the development of mobile e-commerce. Artificially intelligent chatbots or conversational agents can be used to automate the interaction between a company and a customer. Chatbots are computer programs that communicate with their users by using natural language and engage in a conversation by generating natural language as output. The application of chatbots by businesses is no new development in itself. Chatbots have been around in online web-based environments for quite some time and are commonly used to facilitate customer service. Chatbots can respond with messages, recommendations, updates, links, or call-to-action buttons, and customers can shop for products by going through a product carousel, all in the messenger interface. A chatbot can recognize the buyer's intent and refine offerings based on the buyer's choices and preferences. It can then facilitate the sale, order, and delivery process.
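Intent recognition of the kind described above can be approximated in many ways; the paper does not specify one. The following is a minimal, hypothetical keyword-overlap sketch (the intent names and keyword lists are illustrative assumptions, not part of the proposed system):

```python
import re

# Hypothetical intents for a shopping chatbot; real systems would train an
# NLP classifier instead of using fixed keyword sets.
INTENT_KEYWORDS = {
    "search_product": {"show", "find", "search", "looking"},
    "track_order": {"track", "order", "delivery", "shipment"},
    "get_recommendation": {"suggest", "recommend", "best"},
}

def recognize_intent(message: str) -> str:
    """Return the intent whose keyword set overlaps the message the most."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    scores = {intent: len(words & kws) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "fallback"

print(recognize_intent("Can you suggest the best shoes for me?"))  # get_recommendation
print(recognize_intent("Where is my order?"))                      # track_order
```

Once the intent is known, the bot can dispatch to the matching action (product search, order tracking, or a recommendation flow) and fall back to a default reply when no intent matches.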

1.1 Motivation
There are various systems available with chatbots which are currently in use. The available systems are only related to a few categories; for example, Starbucks is related to the food category, i.e. coffee and snacks, while Sephora is associated with makeup material and makeup tutorials. Due to poor memory, chatbots are not able to memorize the past conversation, which forces the user to type the same thing again and again. This can be cumbersome for the customer and annoy them because of the effort required. Due to fixed programs, chatbots can get stuck if an unsaved query is presented to them. This can lead to customer dissatisfaction and result in loss. The multiple messaging can also be taxing for users and deteriorate the overall experience on the website.

2. LITERATURE REVIEW
2.1 Related Work

Anwesh Marwade, Nakul Kumar, Shubham Mundada, and Jagannath Aghav published a paper, "Augmenting E-Commerce Product Recommendations by Analyzing Customer Personality", in 2017, in which they focused on customer-specific personalization. The e-commerce industry predominantly uses various machine learning models for product recommendations and analyzing a customer's behavioral patterns. With the help of an e-commerce-based conversational bot, personality insights can be utilized to develop a unique recommendation system based on order history and conversational data that the bot application gathers over time from users.
Adhitya Bhawiyuga, M. Ali Fauzi, Eko Sakti Pramukantoro, and Widhi Yahya published a paper, "Design of E-Commerce Chat Robot for Automatically Answering Customer Question", in 2017, in which they focused on the design and implementation of an e-commerce chatbot system which provides an automatic


response to the incoming customer-to-seller question. The proposed system consists of two main agents, a communication part and an intelligent part, which can deliver the automatic answer in less than 5 seconds with relatively good matching accuracy.
S. J. du Preez, M. Lall, and S. Sinha published a paper, "An Intelligent Web-Based Voice Chat Bot", in 2009, in which they present the design and development of an intelligent voice recognition chat bot. The paper presents a technology demonstrator to verify a proposed framework required to support such a bot (a web service). By introducing an artificial brain, the web-based bot generates customized user responses, aligned to the desired character.
Bayu Setiaji and Ferry Wahyu Wibowo published a paper, "Chatbot Using A Knowledge in Database", in 2017, in which they describe how a chatterbot, or chatbot, aims to make a conversation between human and machine. The machine has been embedded with knowledge to identify the sentences and make a decision itself as a response to answer a question. The response principle is matching the input sentence from the user. The input sentence is scored to get the similarity of sentences; the higher the score obtained, the more similar it is to the reference sentences.
Godson Michael D'silva, Sanket Thakare, Sharddha More, and Jeril Kuriakose published a paper, "Real World Smart Chatbot for Customer Care using a Software as a Service (SaaS) Architecture", in 2017, in which they

proposed a system architecture which tries to overcome the above shortcomings by analyzing the messages of each ejabberd user to check whether they are actionable or not. If a message is actionable, an automated chatbot initiates a conversation with that user and helps the user resolve the issue by providing human-like interactions using LUIS and cognitive services. To provide a highly robust, scalable, and extensible architecture, this system is implemented on the AWS public cloud.
Cyril Joe Baby, Faizan Ayyub Khan, and Swathi J. N. published a paper, "Home Automation using IoT and a Chatbot using Natural Language Processing", in 2017, in which they focused on a web application using which fans, lights, and other electrical appliances can be controlled over the Internet. The important features of the web application are that, firstly, it has a chatbot algorithm such that the user can text information to control the functioning of the electrical appliances at home. The messages sent using the chatbot are processed using Natural Language Processing techniques. Secondly, any device connected to the local area network of the house can control the devices and other appliances in the house. Thirdly, the web application used to enable home automation also has a security feature that only enables certain users to access the application. And finally, it also has the functionality of sending an email alert when an intruder is detected using motion sensors.

2.2 Literature Review Analysis

Table 1 Literature Review Analysis

Title of Paper | Author | Publication Year | Key Points
Real World Smart Chatbot for Customer Care using a Software as a Service (SaaS) Architecture | Godson Michael D'silva, Sanket Thakare, Sharddha More, and Jeril Kuriakose | 2017 | 1. Responds to actionable messages. 2. Initiates conversation and helps to solve the issue. 3. Implemented on AWS public cloud.
Chatbot Using A Knowledge in Database | Bayu Setiaji, Ferry Wahyu Wibowo | 2017 | 1. Conversation between human and machine. 2. Use of the response principle. 3. Machine identifies the sentence and makes a decision as a response.
Home Automation using IoT and a Chatbot using Natural Language Processing | Cyril Joe Baby, Faizan Ayyub Khan, Swathi J. N. | 2017 | 1. Web application to control home appliances. 2. Messages sent to a chatbot for control. 3. Security and network connectivity.
Design of E-Commerce Chat Robot for Automatically Answering Customer Question | Adhitya Bhawiyuga, M. Ali Fauzi, Eko Sakti Pramukantoro, Widhi Yahya | 2017 | 1. Provides automatic responses. 2. Use of communication and intelligent agents. 3. Good pattern matching accuracy.
Augmenting E-Commerce Product Recommendations by Analysing Customer Personality | Anwesh Marwade, Nakul Kumar, Shubham Mundada, and Jagannath Aghav | 2017 | 1. Use of ML models; analyzes customers' behavioral patterns. 2. Utilizes personality insights (order history).
An Intelligent Web-Based Voice Chat Bot | S. J. du Preez, M. Lall, S. Sinha | 2009 | 1. Proposed framework to support a web-based bot. 2. Use of an artificial brain to generate customized user responses.

2.3 Existing Systems
There are various systems available with chatbots which are currently in use. Though these chatbot-assisted systems are in use, there are some limitations associated with them. The limitations could be:
1) The available systems are only related to a few categories; for example, Starbucks is related to the food category, i.e. coffee and snacks, while Sephora is associated with makeup material and makeup tutorials.
2) Poor memory can be a disadvantage to the system. Due to poor memory, chatbots are not able to memorize the past


conversation, which forces the user to type the same thing again and again. This can be cumbersome for the customer and annoy them because of the effort required.
3) Due to fixed programs, chatbots can get stuck if an unsaved query is presented to them. This can lead to customer dissatisfaction and result in loss. The multiple messaging can also be taxing for users and deteriorate the overall experience on the website.
Chatbots are installed with the motive to speed up the response and improve customer interaction. However, due to limited data availability and the time required for self-updating, this process appears more time-consuming and expensive. Therefore, instead of attending to several customers at a time, chatbots appear confused about how to communicate with people.
Starbucks
Starbucks has developed an Android and iOS application to place an order for a favorite drink or snack. The order can be placed with the help of voice commands or text messaging.
Spotify
Spotify chatbots allow users to search for and listen to their favorite music. They also allow users to share music.
Whole Foods
Whole Foods is related to groceries and food material. It allows users to search for grocery items to shop for. It also provides interesting recipes for users to try.
Sephora
Sephora is associated with makeup material such as foundation, face primer, concealer, blush, highlighter, etc. Sephora chatbots also suggest makeup tutorials in which the user is interested.
Pizza Hut
The Pizza Hut chatbot can help a customer order pizza with favorite toppings and carryout delivery. A customer can reorder a favorite pizza based on previous orders and can ask questions about current deals.
SnapTravel


SnapTravel helps users book hotels according to their convenient location and timings. A customer can also get to know about current deals available at various hotels and resorts.
1-800 Flowers
1-800 Flowers helps customers gift flowers and gifts to someone for events like a birthday, anniversary, or any special occasion. It also offers gift suggestions to customers.
These available chatbots are related to only a few categories. The aim is to combine all the categories together in a single place and integrate them with chatbots for customer service.

3. PROPOSED WORK
The proposed system, e-commerce with chatbots, will permit consolidation of customer login, browsing and purchasing available products, managing orders and payments, engaging the customer with personalized marketing, and qualifying recommendations based on history. The main users of the project are customers who want to shop for various products and services online. From the end-user perspective, the proposed system consists of the following functional elements: a login module to access online products and services, browsing and searching products, purchasing and paying for products, and communicating with the chatbot for better product and offer recommendations. Regarding the back-end logic, Natural Language Processing (NLP) will be used to understand messages sent by the user through the messaging platform. The chatbot will launch an action as an answer with real-time information based on machine learning algorithms such as supervised and unsupervised learning. The bot will improve with an increasing number of messages received. An important feature of the system is handling thousands of customers simultaneously, which will provide better satisfaction to customers. Also, it will be a virtual but personal


assistant for each customer. Similar to chatbots, e-commerce has become one of the preferred ways of shopping, as customers enjoy shopping online because of its easiness and convenience. The combination of an e-commerce site with AI-assisted chatbots will provide better customer service to make profitable sales by personalized marketing. The associated risks, such as privacy issues, can be handled with the help of authentication and authorization to provide strong access control measures. Intellectual property related risks can be avoided by proper instructions to upload data with restrictions. Online security is the most important risk to be considered while developing the system with regard to customers' credentials and the online products and services available. Data storage could be a risk associated with chatbots, as they store information to interact with the users. The best solution in this situation is to store the data in a secure place for a certain amount of time and to delete it after that.

3.1 User Classes and Characteristics

There are essentially three classes of users of the proposed system: the general users, the customers, and the administrators. General users will be able to see and browse through products available to purchase, but they cannot buy the products and services. Customers are the users of the e-commerce system who will be able to browse, purchase, pay, and add products and services to the cart with the available functionality. Chatbots will help them make a purchase decision based on various criteria and suggestions by chatbot algorithms. Customers can also write reviews or feedback on the products and services they purchased. The administrators will have advanced functionality to add, edit, update, and delete products available in inventory. The administrator will also be able to authorize and authenticate the users logged into the system. The administrator will be able to

see daily sales and details about deliveries. He will be able to see the feedback or reviews given by the customers.

3.2 Assumptions and Dependencies
A few assumptions can be made while developing the proposed system:
- A user has an active Internet connection or has access to view the website.
- A user runs an operating system which supports Internet browsing.
- The website will not be violating any Internet ethics or cultural rules and won't be blocked by the telecom companies.
- A user must have basic knowledge of English and computer functionalities.

3.3 Communication Interface
The system should use the HTTPS protocol for communication over the internet, and intranet communication will be through the TCP/IP protocol suite, as the users are connected to the system with an Internet interface. The user must have a web browser registered with an SSL certificate license.

3.4 System Architecture
Systems design is the process of defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements. Systems design could be seen as the application of systems theory to product development. There is some overlap with the disciplines of systems analysis, systems architecture, and systems engineering. The system architecture includes the modules used in the project and the relationships between them based on data flow and processing. The AI-Assisted Chatbots for E-Commerce System consists of the following components:
- General User
- Customer
- Administrator
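The three user classes and their differing permissions could be modeled, for example, as a simple role-to-permission mapping. This is an illustrative sketch only; the role names come from the text, but the permission names and the `can` helper are assumptions:

```python
from enum import Enum

class Role(Enum):
    GENERAL_USER = "general_user"
    CUSTOMER = "customer"
    ADMINISTRATOR = "administrator"

# Permissions per role, following Section 3.1: general users only browse,
# customers also purchase and review, administrators manage the store.
PERMISSIONS = {
    Role.GENERAL_USER: {"browse", "search"},
    Role.CUSTOMER: {"browse", "search", "add_to_cart", "purchase", "review"},
    Role.ADMINISTRATOR: {"browse", "search", "manage_inventory",
                         "view_sales", "view_feedback"},
}

def can(role: Role, action: str) -> bool:
    """Check whether the given role is allowed to perform the action."""
    return action in PERMISSIONS[role]

print(can(Role.GENERAL_USER, "purchase"))  # False
print(can(Role.CUSTOMER, "purchase"))      # True
```

A check like `can(role, "manage_inventory")` would then guard each back-end operation, which matches the access-control measures mentioned in Section 3.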


Fig. 1 System Architecture

Fig. 2 System Architecture Components

The system architecture shows the main components of the proposed system. There are three main user classes of the system: General User, Customer, and Administrator. Along with these users, the important components of the system are shown in the diagram, such as the E-Commerce Website Home Page, Product Categories, Inventory, Sales and Marketing, Shopping Cart, Purchase and Invoice Generation, and Order Tracking and Shipment. A General User is the basic component of the proposed system, who will be able to browse and search through the filters. A General User will be directed to the E-Commerce Website Home Page with various products of different categories, deals on specific products, a menu option to search various categories, and a log-in to the system. The chatbot will also interact with the customer with some basic standards of interaction. A General User can become an Administrator with the help of the Seller Portal of the E-Commerce Website. The Administrator will be the authorized person who will be

able to handle products in inventory, sales of products, refunds to the customers, marketing of products, purchases and transactions made by customers, shipment of ordered products, records of invoices, inventory and customers, etc. The Administrator must be authenticated as per well-defined rules and standards with his/her personal information, contact information, product manufacturing information, and other required information. The Administrator is responsible for managing the quantity of products available in inventory, deals and discounts associated with specific products, marketing of products on a regional basis, pricing of the products, advertising of products to make them sponsored products to increase sales, etc. A Customer is a General User who has logged into the e-commerce system. A customer needs to log in to the proposed system with the help of login credentials such as username and password, along with Name, Address, Phone No., E-Mail, and Credit Card Details. A customer can search products with filters as per need, purchase products, add products to the shopping cart, track order shipment online, track records of invoices, and write reviews of purchased products. A General User can log into the system to make use of more features and functionalities of the system. A User becomes a Customer after logging into the system, and can then not only add products to the shopping cart but also purchase the products. The proposed system will verify and authorize the person by verifying the Phone No. and E-mail ID. It will also verify the Credit Card Details if provided by the user. All the details of each individual customer will be stored in the database with a unique ID. User authentication allows the user to search for and purchase the products available in inventory with the provided address, date of delivery, shipping type, etc. Also, a shopping cart is available for each individual customer to add products from


inventories which are to be purchased. The chatbot will use the search and purchase history, if the user is authenticated by the system, for product suggestions and recommendations. A customer can modify his/her profile or account created in the proposed system. Before updating the profile, the user needs to authenticate that he/she is the original user of the account or profile by providing login credentials such as username and password. After this, the customer needs to provide the attributes to be updated, such as address, phone no., mail ID, credit card details, etc. A Customer can interact with the chatbot to make a purchase decision. The chatbot will interact with a customer based on the customer's browse or search history and purchase history. The chatbot will make use of the customer's record to suggest products from the inventory for the customer to purchase. Initially, chatbots will interact with a basic set of rules designed with machine learning algorithms. With more interaction with the customer, chatbots will also improve at recommending products to customers based on the customer's search and purchase history. The E-Commerce Website's Home Page is designed with important features such as deals on specific products, best-seller products, discounts on products, various categories of products to choose from, a feature to log in to the system, and a way to interact with the chatbot. A user can browse through these categories to view various products such as clothes, accessories, beauty products, shoes, bags, etc. The Inventory is a collection of all categories of products. The Administrator is allowed to add products to the inventory, separated based on category. For example, Clothes is a category of products such as shirts, tops, t-shirts, jeans, skirts, party wear dresses, etc. Similarly, various products of different categories can be added to the inventory by the Administrator. The Administrator needs

to log in to the system with the help of login credentials such as username and password before managing the inventory. The Administrator can add new products with new or existing categories, along with descriptions and images, and can add quantities to products already listed in the inventory. A customer who has logged into the system can search for and add these products to the Shopping Cart. The customer can also purchase the products in the inventory or the products added to the shopping cart. A Shopping Cart is temporary storage to save the products which a customer may want to purchase in the future. The Shopping Cart is separate storage for each individual customer who has logged into the proposed system. The products added to the shopping cart can be purchased by the customer. To purchase a product, the customer needs to provide related information such as Name, Address, Phone No., Date of Delivery, Shipping Type, Payment Method, and Credit Card Details in the case of card payment. A customer can modify the shopping cart items: the customer can either purchase the products in the shopping cart or remove products from it. Purchase history is recorded in the form of invoice reports, order reports, and transaction reports. An invoice is generated after the customer purchases products from the inventory, and it includes all the details of the purchase and transactions made by the customer. It includes the details of the products purchased, such as price, quantity, product ID, and product category, along with customer details such as name, delivery address, shipping type, date of delivery, phone no., and payment method. All the details regarding purchase history are used by the chatbot to interact with the customer based on his/her history to suggest or recommend products. A customer can track the shipment of an order based on invoices recorded or transactions saved to


his/her profile. The online tracking of an order can help the customer locate his/her product. Sales and Marketing involves the techniques to suggest that a customer purchase particular products through advertisement. It is done based on the keywords searched by the customers for the products he/she wants to purchase. Advertising products is a way of marketing to increase sales of products. All these things are managed by the Administrator to maximize sales of products with the help of product marketing.

4. PERFORMANCE ANALYSIS
The performance of the proposed system can be analyzed based on a few parameters. These parameters can be used to measure the performance of the system in comparison with the existing system. The parameters that can be used for this analysis are:

- Human-Machine Interaction
To provide better interaction between human and machine, AI concepts such as Artificial Neural Networks (ANN), Natural Language Processing (NLP), and machine learning algorithms are used in the proposed system. Human users will interact with the system through the chatbot, which is a software program designed to communicate with the user. Machine learning algorithms will help the chatbot generate responses using supervised and unsupervised algorithms. This will enhance the performance of the proposed system compared with the existing system, as fixed programs are used for the chatbot in the existing system.
- Better Recommendations to the User
Recommendations are the suggestions provided to the user based on the search or browse history and purchase history of the particular user. These recommendations provided by the chatbot can be in the form of product recommendations with links and updates about the latest products. To facilitate customer service and support, recommendations will

play a more important role by providing personalized suggestions. This will help customers make a purchase decision, which will increase profitable sales by personalized marketing. It will improve the performance of the proposed system due to personal recommendations.
- Use of AI Concepts
AI concepts such as Artificial Neural Networks (ANN), Natural Language Processing (NLP), and Machine Learning (ML) algorithms are used in the proposed system. Machine learning algorithms such as supervised and unsupervised algorithms will improve the performance of the proposed system. Linear regression is capable of predictive modelling and minimizing the risk of failure; it is used for predicting responses with better accuracy and makes use of the relationship between input values and output values. The Naïve Bayes algorithm is used on large data sets for ranking or indexing purposes. It will help to rank the products based on customer reviews. Semi-supervised algorithms will help to handle the combination of both labelled and unlabelled data. NLP is useful for a machine to understand human language and generate responses in human language. It will make use of elements of Named Entity Recognition, Speech Recognition, Sentiment Analysis, and OCR. All these concepts will help to enhance the performance of the proposed system.
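As a rough illustration of the Naïve Bayes review-ranking idea mentioned above, the following is a minimal sketch, not the authors' implementation: a tiny multinomial Naïve Bayes sentiment classifier with Laplace smoothing, used to rank products by how many of their reviews score as positive. The training data and product names are invented for the example:

```python
import math
from collections import Counter

# Tiny invented training set of labelled reviews.
TRAIN = [("great quality fast delivery", "pos"),
         ("love this product works well", "pos"),
         ("poor quality broke quickly", "neg"),
         ("terrible waste of money", "neg")]

def train(data):
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in data:
        counts[label].update(text.split())
    vocab = set(w for c in counts.values() for w in c)
    return counts, vocab

def log_prob(text, label, counts, vocab):
    # Log-likelihood of the words under the class, with Laplace smoothing.
    total = sum(counts[label].values())
    return sum(math.log((counts[label][w] + 1) / (total + len(vocab)))
               for w in text.split())

def is_positive(review, counts, vocab):
    return log_prob(review, "pos", counts, vocab) > log_prob(review, "neg", counts, vocab)

counts, vocab = train(TRAIN)
reviews = {"shoes": ["great quality", "works well"],
           "bag": ["broke quickly", "waste of money"]}
# Rank products by their number of positive-scoring reviews.
ranking = sorted(reviews, key=lambda p: -sum(is_positive(r, counts, vocab)
                                             for r in reviews[p]))
print(ranking)  # ['shoes', 'bag']
```

A production system would train on a far larger labelled corpus and would more likely use a library classifier, but the ranking principle, score each review and aggregate per product, is the same.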

5. CONCLUSION
The Internet has become a major resource in modern business; thus, electronic shopping has gained significance not only from the entrepreneur's but also from the customer's point of view. For the entrepreneur, electronic shopping generates new business opportunities, and for the customer, it makes comparative shopping possible. As per a survey, most consumers of online stores are impulsive and usually make a decision to stay on a site within the first few seconds. Hence, we have designed the project


to provide the user with easy navigation, retrieval of data, and necessary feedback as much as possible. As we have seen in this project, the process of creating a user-friendly and straightforward platform that facilitates the administrator's job is one filled with complexity. From understanding user requirements to system design and finally system prototyping and finalization, every step requires in-depth understanding and commitment towards achieving the objective of the project. So this is an efficient and effective way for customers to purchase products online with the help of a chatbot within a few steps. With the help of an e-commerce website, sellers can reach a larger audience, and with the help of chatbots, sales can be increased by personal interaction with the users. In this way, this application provides an optimized solution with better availability, maintainability, and usability.

REFERENCES
[1] Adhitya Bhawiyuga, M. Ali Fauzi, Eko Sakti Pramukantoro, Widhi Yahya, "Design of E-Commerce Chat Robot for Automatically Answering Customer Question", University of Brawijaya, Malang, Republic of Indonesia, 2017.
[2] Anwesh Marwade, Nakul Kumar, Shubham Mundada, and Jagannath Aghav, "Augmenting E-Commerce Product Recommendations by Analyzing Customer Personality", 2017.
[3] Bayu Setiaji, Ferry Wahyu Wibowo, "Chatbot Using A Knowledge in Database", 2017.
[4] Abdul-Kader, S. A., & Woods, J., "Survey on Chatbot Design Techniques in Speech Conversation Systems", International Journal of Advanced Computer Science and Applications, 2015.
[5] Godson Michael D'silva, Sanket Thakare, Sharddha More, and Jeril Kuriakose, "Real


World Smart Chatbot for Customer Care using a Software as a Service (SaaS) Architecture", 2017.
[6] S. J. du Preez, M. Lall, S. Sinha, "An Intelligent Web-Based Voice Chat Bot", 2009.
[7] Cyril Joe Baby, Faizan Ayyub Khan, Swathi J. N., "Home Automation using IoT and a Chatbot using Natural Language Processing", 2017.
[8] Ellis Pratt, "Artificial Intelligence and Chatbots in Technical Communication", 2017.
[9] Bayan Abu Shawar, Arab Open University, Information Technology Department, Jordan, "Integrating Computer Assisted Learning Language Systems with Chatbots as Conversational Partners", 2017.
[10] Aditya Deshpande, Alisha Shahane, Darshana Gadre, Mrunmayi Deshpande, Prof. Dr. Prachi M. Joshi, "A Survey Of Various Chatbot Implementation Techniques", International Journal of Computer Engineering and Applications, Volume XI, May 2017.
[11] Sameera A. Abdul-Kader, Dr. John Woods, "Survey on Chatbot Design Techniques in Speech Conversation Systems", International Journal of Advanced Computer Science and Applications (IJACSA), Vol. 6, 2015.
[12] M. J. Pereira, and L. Coheur, "Just.Chat - a platform for processing information to be used in chatbots", 2013.
[13] A. S. Lokman, and J. M. Zain, "One-Match and All-Match Categories for Keywords Matching in Chatbot", American Journal of Applied Sciences, vol. 7, 2010.
[14] S. Ghose and J. J. Barua, "Toward The Implementation of A Topic Specific Dialogue Based Natural Language Chatbot As An Undergraduate Advisor", Proc. IEEE 2013 International Conference on Informatics, Electronics & Vision (ICIEV), 2013.
[15] R. Kar, and R. Haldar, "Applying Chatbots to the Internet of Things: Opportunities and Architectural Elements".
[16] McTear, Michael, Zoraida Callejas, and David Griol, "Creating a Conversational Interface Using Chatbot Technology", Springer International Publishing, 2016.

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.


Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

DISTRIBUTED STORAGE, ANALYSIS, AND EXPLORATION OF MULTIDIMENSIONAL PHENOMENA WITH TRIDENT FRAMEWORK

Nikesh Mhaske1, Dr Prashant Dhotre2
1,2 Department of Computer Engineering, Dr. D. Y. Patil Institute of Technology, Pune, India
[email protected], [email protected]

ABSTRACT
Today's rising storage and computational capacities have led to the accumulation of voluminous, detailed datasets. These datasets capture accurate, detailed descriptions of natural phenomena, usage patterns, trends, and other aspects of complex, real-world systems. Statistical and machine learning models are often employed to identify these patterns or attributes of interest. However, a wide array of potentially relevant models and parameter choices exists, and a given model may provide the best performance only after preprocessing steps have been carried out. TRIDENT is an integrated framework that targets both how and where training data is stored in the system. Data partitioning can be configured using multiple strategies, including hash-based and spatially-aware partitioners. The default partitioner performs correlation analysis between independent and dependent variables to achieve dimensionality reduction. Reduced-dimensionality feature vectors are then clustered and dispersed to storage nodes that hold similar data. Clustering data points with high similarity enables the creation of specialized models that outperform models generated with randomly-placed data. TRIDENT supports three key aspects of handling data in the context of analytic modeling: (1) distribution and storage, (2) feature space management, and (3) support for ad hoc retrieval and exploration of model training data.

Keywords— Distributed analytics, voluminous data management, machine learning

1. INTRODUCTION
Recent advancements in distributed storage and computation engines have enabled analytics at an unprecedented scale, with systems such as Spark and Hadoop allowing users to build distributed applications that gain insight from voluminous, detailed, multidimensional datasets. While these systems are highly effective from a computational standpoint, both exploration and feature engineering for machine learning models require several rounds of computation and incur I/O costs as data is migrated into main memory. To address these use cases we propose TRIDENT, which supports three key aspects of handling data in the context of analytic modeling: (1) distribution and storage, (2) feature space management, and (3) support for ad hoc retrieval and exploration of model training data. In this system, incoming feature vectors are partitioned to facilitate targeted analysis over specific subsets of the feature space. Transformations supported by TRIDENT include normalization, binning, and dimensionality reduction based on correlation analysis. Retrieval and exploration of model training data is enabled by expressive queries that can prune the feature space, sample across feature vectors, or combine portions of the data. Exposing this functionality at the storage level allows many steps in the feature engineering process to be performed before analysis begins. By using this functionality to maximum advantage, researchers and practitioners can explore and inspect their datasets in an interactive fashion to help guide the creation of machine learning models or visualizations, without needing to write ad-hoc applications or wait for heavyweight distributed computations to execute.

2. LITERATURE SURVEY

TensorFlow [1] is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom-designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with a focus on training and inference on deep neural networks. Several Google services use TensorFlow in production; it has been released as an open-source project and has become widely used for machine learning research. The authors describe the TensorFlow dataflow model and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.

Resilient Distributed Datasets (RDDs) [2] are a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner. RDDs are motivated by two types of applications that current computing frameworks handle inefficiently: iterative algorithms and interactive data mining tools. In both cases, keeping data in memory can improve performance by an order of magnitude. To achieve fault tolerance efficiently, RDDs provide a restricted form of shared memory, based on coarse-grained transformations rather than fine-grained updates to shared state. However, the authors show that RDDs are expressive enough to capture a wide class of computations, including recent specialized programming models for iterative jobs, such as Pregel, and new applications that these models do not capture. RDDs are implemented in a system called Spark, which is evaluated through a variety of user applications and benchmarks.

Bigtable [3] is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. The paper describes the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and describes the design and implementation of Bigtable.

Cassandra [4] is a decentralized structured storage system for managing very large amounts of structured data spread across many commodity servers, while providing highly available service with no single point of failure. Cassandra aims to run on top of an infrastructure of many nodes, possibly spread across different data centers. At this scale, small and large components fail continuously. The way Cassandra manages its persistent state in the face of these failures drives the reliability and scalability of the software systems relying on this service. While in many ways Cassandra resembles a database and shares many design and implementation strategies with one, it does not support a full relational data model; instead, it provides clients with a simple data model that supports dynamic control over data layout and format. The Cassandra system was designed to run on cheap commodity hardware and handle high write throughput without sacrificing read efficiency.

Distributed hash tables [5]: the proliferation of observational devices and sensors with networking capabilities has led to growth in both the rates and sources of data that ultimately contribute to extreme-scale data volumes. Datasets generated in such settings are frequently multidimensional, with each dimension accounting for a feature of interest. The authors posit that efficient evaluation of queries over such datasets must account for both the distribution of data values and the patterns in the queries themselves. Configuring query evaluation by hand is impracticable given the data volumes, dimensionality, and the rates at which new data and queries arrive. They describe an algorithm to autonomously improve query evaluations over voluminous, distributed datasets; the approach independently tunes for the most dominant query patterns and the distribution of values across a dimension. The algorithm is evaluated in the context of their system, Galileo, a hierarchical distributed hash table used for managing geospatial, time-series data, which strikes a balance between fast evaluations, memory utilization, and search space reductions. Empirical evaluations are performed on a multidimensional dataset comprising a billion files. The schemes described are broadly applicable to any system that leverages distributed hash tables as a storage mechanism.

3. PROPOSED METHODOLOGY

A key theme underpinning these core capabilities is the preservation of timeliness, allowing the analyst to quickly identify interesting data, gather insights, fit models, and assess their quality. To contrast with other approaches, consider a basic computational operation: retrieving the average (mean) of a particular feature. While straightforward in an algorithmic sense, this requires heavy disk and memory I/O in systems such as Hadoop or Spark, whereas in TRIDENT the operation can be completed in less than 1 ms by querying our indexing structure. Since the metadata collected by the system is general and can be fused, filtering such a query based on time or additional feature values does not incur additional latency. TRIDENT is designed to assimilate data incrementally as it arrives, allowing both streaming and in-place datasets to be managed. The system employs a network design based on distributed hash tables (DHTs) to ensure scalability as new nodes are added to its resource pool, and uses a gossip protocol to keep nodes informed of the collective system state. This allows flexible preprocessing and creation of training data for statistical and machine learning models. Our methodology encompasses three core capabilities:

1) Data Dispersion: Effective dispersion of the dataset over a collection of nodes underpins data locality, representativeness of in-memory data structures, and the efficiency of query evaluations. The resulting data locality promotes timeliness during construction of specialized models for different portions of the feature space.

2) Feature Space Management: TRIDENT maintains memory-resident metadata to help locate portions of the dataset, summarize its attributes, and preprocess feature vectors. Online sketches ensure the data can be represented compactly and with high accuracy, while preprocessing activities enable operations such as dimensionality reduction or normalization.

3) Data Selection and Model Construction: TRIDENT supports interactive exploration via steering and calibration queries to probe the feature space. These real-time queries help analysts sift and identify training data of interest. Training data can be exported to a variety of formats, including DataFrame implementations supported by R, Pandas, and Spark. TRIDENT also manages training and assessment of analytical models via generation of cross-validation folds and bias-variance decomposition of model errors.
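The data dispersion step above can be sketched in a few lines. This is an illustrative reconstruction, not TRIDENT's actual code: `pearson`, `reduce_dims`, and `assign_nodes` are hypothetical helper names, and the fixed centroids stand in for the clusters the system would learn.

```python
# Sketch of correlation-based dimensionality reduction followed by
# similarity-based placement of feature vectors onto storage nodes.
import math

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    vy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy) if vx and vy else 0.0

def reduce_dims(vectors, targets, keep):
    """Keep the `keep` features most correlated with the dependent variable."""
    n_feat = len(vectors[0])
    scores = [abs(pearson([v[i] for v in vectors], targets)) for i in range(n_feat)]
    top = sorted(range(n_feat), key=lambda i: -scores[i])[:keep]
    return [[v[i] for i in top] for v in vectors], top

def assign_nodes(reduced, centroids):
    """Route each reduced vector to the storage node with the nearest centroid."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(centroids)), key=lambda c: dist(v, centroids[c]))
            for v in reduced]

# Toy data: feature 0 tracks the target, feature 1 is noise-like.
vecs = [[1.0, 9.0], [2.0, 1.0], [3.0, 8.0], [4.0, 2.0]]
tgts = [1.0, 2.0, 3.0, 4.0]
reduced, kept = reduce_dims(vecs, tgts, keep=1)
nodes = assign_nodes(reduced, centroids=[[1.5], [3.5]])
print(kept, nodes)  # feature 0 is kept; similar vectors land on the same node
```

Because similar reduced vectors are routed to the same node, each node can later fit a specialized local model over a coherent region of the feature space.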

While we evaluate TRIDENT in the context of two representative datasets, our methodology does not preclude the use of data from other domains with similar dimensionality (hundreds to thousands of dimensions) where there is a need to understand phenomena or forecast outcomes.

Fig. 1. TRIDENT architecture: multidimensional records are partitioned and indexed for subsequent analysis through expressive queries.

4. CONCLUSIONS

TRIDENT controls the placement of incoming feature vectors by reducing their dimensionality and clustering similar data points. Cluster quality is evaluated with the Davies-Bouldin index, and we demonstrate improvements in building specialized local models across the nodes in the system. After partitioning, feature vectors are passed to online sketch instances and our memory-resident, hierarchical analytic base tree (ABT) data structures. This allows information about the underlying dataset to be retrieved, and transformations to be applied, without requiring disk I/O. Additionally, our analytic base trees support flexible queries to locate and refine portions of the feature space in memory. Online summary statistics also provide detailed information about the features under study without accessing files on disk, and preprocessing operations are cached to avoid duplicate transformations. Finally, our query-driven approach allows subsets of the feature space to be selected, creating training data sets that can be passed on to machine learning frameworks. To support such activities, we provide a base set of analytical models that can serve as pilot studies. Bias-variance decomposition of these models is also made available to allow the analyst to judge performance.
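The Davies-Bouldin index mentioned above can be computed directly from cluster assignments; the following is a minimal self-contained sketch of the metric (lower is better), not TRIDENT's implementation.

```python
# Davies-Bouldin index: average, over clusters, of the worst-case ratio of
# within-cluster scatter to between-centroid separation.
import math

def centroid(pts):
    return [sum(c) / len(pts) for c in zip(*pts)]

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def davies_bouldin(clusters):
    cents = [centroid(c) for c in clusters]
    # Average distance of each cluster's points to its own centroid.
    scatter = [sum(euclid(p, cents[i]) for p in c) / len(c)
               for i, c in enumerate(clusters)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity ratio against every other cluster.
        total += max((scatter[i] + scatter[j]) / euclid(cents[i], cents[j])
                     for j in range(k) if j != i)
    return total / k

tight = [[(0.0, 0.0), (0.1, 0.0)], [(5.0, 5.0), (5.1, 5.0)]]
loose = [[(0.0, 0.0), (2.0, 0.0)], [(3.0, 0.0), (5.0, 0.0)]]
print(davies_bouldin(tight) < davies_bouldin(loose))  # True: tighter, well-separated clusters score lower
```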

REFERENCES


[1] M. Abadi, P. Barham, J. Chen et al., "TensorFlow: A system for large-scale machine learning," in Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI '16), Berkeley, CA, USA: USENIX Association, 2016, pp. 265-283. [Online]. Available: http://dl.acm.org/citation.cfm?id=3026877.3026899
[2] M. Zaharia et al., "Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing," in Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation (NSDI '12), Berkeley, CA, USA: USENIX Association, 2012, pp. 2-2. [Online]. Available: http://dl.acm.org/citation.cfm?id=2228298.2228301
[3] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber, "Bigtable: A distributed storage system for structured data," ACM Trans. Comput. Syst., vol. 26, no. 2, pp. 4:1-4:26, Jun. 2008. [Online]. Available: http://doi.acm.org/10.1145/1365815.1365816
[4] A. Lakshman and P. Malik, "Cassandra: A decentralized structured storage system," SIGOPS Oper. Syst. Rev., vol. 44, no. 2, pp. 35-40, Apr. 2010.
[5] M. Malensek, S. L. Pallickara, and S. Pallickara, "Autonomously improving query evaluations over multidimensional data in distributed hash tables," in Proceedings of the 2013 ACM Cloud and Autonomic Computing Conference (CAC), Sep. 2013, pp. 15:1-15:10. [Online]. Available: https://www.cs.usfca.edu/mmalensek/publications/malensek2013autonomously.pdf


DATA MINING AND INFORMATION RETRIEVAL


UTILISING LOCATION BASED SOCIAL MEDIA FOR TARGET MARKETING IN TOURISM: BRINGING THE TWITTER DATA INTO PLAY

Prof. G. S. Pise1, Sujit Bidawe2, Kshitij Naik3, Palash Bhanarkar4, Rushikesh Sawant5
1,2,3,4,5 Dept. of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, India.
[email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
A growing body of literature has been devoted to harnessing the crowdsourcing power of social media by extracting "hidden" knowledge from the huge amounts of information available through online portals. The need to understand how social media affect the hospitality and tourism field has increased. In this paper, the discussions and demonstrations focus on social media analytics using Twitter data referring to "XYZ" travel. The paper also shows how social media data can be used indirectly and with minimal cost to extract travel attributes such as trip purpose and activity location. The results open up avenues for travel demand modellers to explore the possibilities of big data for modelling trips, and the study provides feasible marketing strategies that help grow business as well as customer satisfaction.

Categories and Subject Descriptors: [Database Applications]: Data mining, Spatial databases and GIS; [Online Information Services]: Web-based services

KEYWORDS: Twitter, social media, big data analytics, location based social media

1. INTRODUCTION
Transport infrastructure is one of the most important factors for a country's progress. Although India has a large and diverse transport sector with its own share of challenges, these can be overcome by energy-efficient technologies and a customer-focused approach. Many instances have shown how transport infrastructure adds speed and efficiency to a country's progress. India, the seventh largest nation with over a billion people, has one of the largest transport sectors, but not one without its own set of challenges. Travel demand modelling is widely applied for analysis of major transportation investments. Marketing is a hefty task in the tourism field, since knowing the interested customers, finding them, and making them interested in your schemes, especially in a country with a population as vast as India's, is as hard as it can get.

Important attributes considered for travel modelling are:
1. Tour purpose
2. Departure time
3. Mode of transport
4. Tour duration
5. Tour location
6. Travel route
7. Party organization
8. Traffic state

Challenges for the travel modeller are:
1. Complexity
2. Cost
3. Weather condition
4. Traveller anxiety

Although increasing the spectrum and quantity of data captured and analyzed will be a key trend, the quality of both the underlying data and the end product will arguably be of even greater importance. Quality is the single greatest issue, and there is the potential not just to be inefficient in using an analytics program, but downright dangerous. Feeding large sums of irrelevant or simply incorrect data into a data program has the potential to push the right conclusions into the margins and promote erroneous findings. We can therefore expect travel brands to examine closely the quality of data both internally and externally, and to corroborate their results with findings from alternative independent sources. Improving data governance, accessibility, and structure will also be crucial to driving forward the complexity of analytics. The travel industry is relatively well advanced in introducing analytics programs, but we can anticipate organizations increasingly moving down the funnel towards predictive and prescriptive analytics. Achieving more advanced analytics requires more sophisticated systems, and a significant part of the budgetary increases will be spent on new tools to mine and interpret data. It was found that most revenue managers felt they did not have all the tools necessary to do their job, which demonstrates the depth of the market for further spending on tech. Extracting relevant information is not a challenge if only general information is used, but it is when hash-tag data or check-in data is in use. This study investigates how social media data can be used to ease and augment cross sales, target marketing, transportation planning, management, and operation. This paper is structured as follows. First, the motivation for undertaking this project is elaborated; then the literature is reviewed with a focus on the application of social media data in the field of tours and travels. Then a comprehensive framework is presented for using social media data in the domain of travel sales. Next, the proposed system is discussed, followed by a summary and future scope.
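As a rough illustration of extracting travel attributes such as trip purpose and activity location from tweets, the following sketch uses an invented keyword lexicon; a real pipeline would rely on NLP models and geotags rather than this hypothetical `PURPOSE_KEYWORDS` table.

```python
# Illustrative keyword-based attribute extraction from tweet records.
# The lexicon and record fields below are assumptions, not a real schema.
PURPOSE_KEYWORDS = {
    "leisure": ["beach", "vacation", "sightseeing", "holiday"],
    "business": ["conference", "meeting", "client"],
    "dining": ["restaurant", "dinner", "cafe"],
}

def infer_purpose(text):
    # First lexicon category with a keyword hit wins; otherwise "unknown".
    words = text.lower().split()
    for purpose, keys in PURPOSE_KEYWORDS.items():
        if any(k in words for k in keys):
            return purpose
    return "unknown"

def extract_attributes(tweet):
    """Map one tweet record to the travel attributes discussed above."""
    return {
        "purpose": infer_purpose(tweet["text"]),
        "location": tweet.get("geo"),       # activity location, if geotagged
        "time": tweet.get("created_at"),    # proxy for departure/activity time
    }

tweet = {"text": "Amazing beach day in Goa!", "geo": (15.30, 74.08),
         "created_at": "2019-01-12T10:30:00"}
print(extract_attributes(tweet)["purpose"])  # leisure
```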

2. MOTIVATION
Generally, the cost of obtaining such social media data is trivial, but processing such massive databases to extract travel information is a challenging task, especially for attributes such as traveller and tourist identification. As a result, the accuracy of the outcome is not expected to be high unless advanced data mining and linguistic techniques are used. Nonetheless, the true potential of these techniques in extracting information from social media data is yet to be explored. The data obtained from various social media provides insight into social networks and users that was not available before, in both scale and extent. This social media data can surpass real-world boundaries to study human interactions and help measure popular social and political sentiment associated with regional populations without time-consuming explicit surveys. Social media effectively records viral marketing trends and is the ideal source to study, better understand, and leverage influence mechanisms. However, obtaining relevant data from social media without applying data mining techniques is challenging, owing to numerous obstacles. Data can now be stored in many different kinds of databases and information repositories. One data repository architecture that has emerged is the data warehouse, a repository of multiple heterogeneous data sources organized under a unified schema at a single site in order to facilitate management decision making. Data warehouse technology includes data cleaning, data integration, and on-line analytical processing (OLAP).


3. LITERATURE SURVEY

Paper | Author | Publication
Effectiveness of Bayesian Updating Attributes in Data Transferability Applications | Rashidi, T., J. Auld, and A. Mohammadian | Transportation Research Record: Journal of the Transportation Research Board, 2013
Effect of Variation in Household Sociodemographics, Lifestyles, and Built Environment on Travel Behavior | Rashidi, T., A. Mohammadian, and Y. Zhang | Transportation Research Record: Journal of the Transportation Research Board, 2010
Urban Passenger Data Collection: Keeping Up With a Changing World | Transportation Association of Canada | Transportation Association of Canada

[1] Rashidi, T., J. Auld, and A. Mohammadian, "Effectiveness of Bayesian Updating Attributes in Data Transferability Applications," Transportation Research Record: Journal of the Transportation Research Board, 2013(2344): p. 1-9.
The applications of the Bayesian updating formulation in the transportation and travel demand fields are continually growing. Improving the state of belief and knowledge about data by incorporating existing prior information is one of the major properties of Bayesian updating that makes this approach superior to other approaches to transferability.

[2] Rashidi, T., A. Mohammadian, and Y. Zhang, "Effect of Variation in Household Sociodemographics, Lifestyles, and Built Environment on Travel Behavior," Transportation Research Record: Journal of the Transportation Research Board, 2010(2156): p. 64-72.
Unlike the traditional method of assuming a normal (Gaussian) distribution, more than 40 different probability density functions were tested and validated on 11 clusters of homogeneous household types representing their lifestyles; 22 household- and individual-level travel attributes were considered.

[3] Urban Passenger Data Collection: Keeping Up With a Changing World, Transportation Association of Canada. https://www.tac-atc.ca/sites/tac-atc.ca/files/site/doc/Bookstore/data-collection-primer.pdf
This approach would provide an "organic", voluntary, one-step-at-a-time approach to evolving a national data collection program that would be driven "bottom up" by the provincial and municipal organisations. It would facilitate collaboration and the sharing of data and experience among provinces and their constituent urban areas across the nation. And it would encourage experimentation through the spreading of risk and, possibly, pooling of funds.

4. PROPOSED WORK
In the proposed system, we address the limitations of previous vague marketing methods in tourism and make targeting efficient through a focused approach to data acquisition and marketing using Twitter data. With the help of Twitter data, we can find tourists more efficiently. The use of dynamic Twitter data makes marketing optimal in terms of time and targeting, and the system uses recently updated data for client search, which improves results.

DATA FLOW DIAGRAM - LEVEL 0

Fig. Data Flow Diagram- Level 0
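The client-search step sketched in the diagram above might look like the following, assuming tweets have already been collected into plain records (e.g. via the Twitter search API); `TRAVEL_SIGNALS` and the seven-day window are illustrative choices, not the system's actual rules.

```python
# Illustrative targeting step: keep users whose recent tweets show travel intent.
from datetime import datetime, timedelta

TRAVEL_SIGNALS = ("trip", "travel", "vacation", "flight", "hotel")

def recent_prospects(tweets, now, window_days=7):
    """Return screen names whose recent tweets contain a travel signal."""
    cutoff = now - timedelta(days=window_days)
    prospects = []
    for t in tweets:
        posted = datetime.fromisoformat(t["created_at"])
        text = t["text"].lower()
        if posted >= cutoff and any(s in text for s in TRAVEL_SIGNALS):
            prospects.append(t["user"])
    return sorted(set(prospects))

tweets = [
    {"user": "asha", "text": "Planning a trip to Pune!", "created_at": "2019-03-10T09:00:00"},
    {"user": "ravi", "text": "Great cricket match today", "created_at": "2019-03-10T09:05:00"},
    {"user": "meera", "text": "Booked my flight at last", "created_at": "2019-01-01T12:00:00"},
]
print(recent_prospects(tweets, now=datetime(2019, 3, 12)))  # ['asha']
```

Restricting the time window is what makes the targeting "dynamic": only users showing current travel intent are surfaced for marketing.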

5. SUMMARY AND CONCLUSION
This paper focuses on how Twitter data can be used to analyse the individual-level travel behaviour of users. The framework opens up further applications of Twitter and other social media data for client search in the travel industry, and for management, sales, and operation purposes. With the help of Twitter posts it becomes easy to track tourists around a given location. It was found that tweets are mainly associated with the ease of the tourists and the facilities provided to them, which demonstrates the usefulness of Twitter data for analysing the behaviour of tourists in cities. The data obtained from Twitter arrives in large volumes, yielding broad insight at a time, and the approach is time efficient since a highly scalable area can be targeted in a single pass. Twitter data provides various information about its users that is difficult to obtain otherwise, enabling efficient target marketing with variable parameters as the need arises.

6. FUTURE WORKS
1. In-home activity data: If an activity is scheduled to happen at home, one out-of-home activity is cancelled, which results in fewer trips on the transport network; this is of great importance to travel demand modellers and planners.
2. Tour formation: Tour formation requires collecting information about trips. Twitter users often provide information about their daily activities, which helps to extract the location, time, and purpose of different activities. Using Twitter data for modelling tour formation behaviour can significantly complement the models developed using household travel surveys.
3. Future activities: When Twitter data is extracted using different techniques, it becomes possible to recognize potential future activities. In other words, based on a tweet about the place a user wants to visit, he/she is likely to be at that location at a time to be determined. This helps to manage future tours and their activities.

7. ACKNOWLEDGEMENTS
With due respect and gratitude we would like to take this opportunity to thank our internal guide PROF. G. S. PISE for giving us all the help and guidance we needed. We are really grateful for his kind support. He has always encouraged us and given us the motivation to move ahead. He has put in a lot of time and effort in this project along with us and given us a lot of confidence. We are also grateful to DR. P. N. MAHALLE, Head of Computer Engineering Department, Smt. Kashibai Navale College of Engineering, for his indispensable support. We also wish to thank all the other people who have helped us in the successful completion of this project. We would also like to extend our sincere thanks to Principal DR. A. V. DESHPANDE for his dynamic and valuable guidance throughout the project and for providing the necessary facilities that helped us complete our dissertation work. We would like to thank our colleagues and friends who have helped us directly or indirectly to complete this work.

REFERENCES
[1] Rashidi, T., J. Auld, and A. Mohammadian, "Effectiveness of Bayesian Updating Attributes in Data Transferability Applications," Transportation Research Record: Journal of the Transportation Research Board, 2013(2344): p. 1-9.
[2] Rashidi, T., A. Mohammadian, and Y. Zhang, "Effect of Variation in Household Sociodemographics, Lifestyles, and Built Environment on Travel Behavior," Transportation Research Record: Journal of the Transportation Research Board, 2010(2156): p. 64-72.
[3] Francis, R.C., et al., "Object tracking and management system and method using radio frequency identification tags," 2003, Google Patents.
[4] CRISP-DM, https://paginas.fe.up.pt/~ec/files_0405/slides/02%20CRISP.pdf
[5] Urban Passenger Data Collection: Keeping Up With a Changing World, Transportation Association of Canada, https://www.tac-atc.ca/sites/tac-atc.ca/files/site/doc/Bookstore/data-collection-primer.pdf


CROSS MEDIA RETRIEVAL USING MIXED GENERATIVE HASHING METHODS

Saurav Kumar1, Shubham Jamkhola2, Mohd Uvais3, Paresh Khade4, Mrs Manjusha Joshi5
1,2,3,4,5 Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India.
[email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
Hash methods are useful for a number of tasks and have attracted much attention in recent times. Different approaches have been proposed to capture the similarities between text and images. Most of the existing work uses the bag-of-words method to represent text information. Since words with different forms may have the same meaning, the semantic similarities of text cannot be well captured by these methods. To overcome these challenges, a new method called Semantic Cross-Media Hashing (SCMH) is proposed, which uses continuous representations of words to capture semantic textual similarity and uses a Deep Belief Network (DBN) to build the correlation between different modalities. In this method we use the skip-gram algorithm for word embedding, the SIFT descriptor to extract key points from images, and the MD5 algorithm for hash code generation. To demonstrate the effectiveness of the proposed method, three commonly used data sets are considered as baselines; in the proposed system we use the Flickr dataset for experimental purposes. Experimental results show that the proposed method achieves significantly better results, and its effectiveness is similar or superior to that of other hash methods. We also remove a drawback of Flickr in the proposed system.

Keywords- Deep Belief Network, Flickr, Semantic Cross-Media Hashing

1. INTRODUCTION
Internet information has become much easier to view and search for text and images. Therefore, hash-based similarity calculation and approximate nearest-neighbour search methods have been proposed and have received remarkable attention in the last few years. Various applications use such methods for information retrieval, near-duplicate detection, and data mining. On different social networking sites, where information enters through multiple channels, searching for images returns both relevant and irrelevant data. In the existing system, a new hashing method, Semantic Cross-Media Hashing (SCMH), is used for near-duplicate detection and cross-media retrieval. Given a collection of text-image bi-modality data, we first represent image and text respectively. The cross-media retrieval makes use of a skip-gram algorithm for word embeddings to represent text information and the SIFT descriptor to extract key points from the images. The Fisher kernel framework is used to incorporate both text and image information into fixed-length vectors. To map Fisher vectors of different modalities, a deep belief network is used to carry out the task. The proposed system removes a drawback of the Flickr websites, and the MD5 algorithm is used for hash code generation. SCMH gives better results than more advanced methods with different hash code lengths, and displays query results in order of classification or mapping. After mapping, the ranking of the images is determined according to the search.

2. LITERATURE SURVEY
1. Spatially Constrained Bag-of-Visual-Words for Hyperspectral Image Classification
A spatially constrained Bag-of-Visual-Words (BOV) method is proposed for hyperspectral image classification. We first

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 227

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

extract the texture feature. The spectral and texture features are used as two kinds of low-level features, on the basis of which the high-level visual words are constructed by the proposed method. We use the entropy-rate superpixel segmentation method to segment the hyperspectral image into patches that well preserve the homogeneity of regions. The patches are viewed as documents in the BOV model. Then k-means clustering is executed to cluster pixels and build the codebook. Finally, the BOV representation is constructed from the statistics of the occurrence of visual words for each patch. Experiments on real data demonstrate that the proposed method is comparable to several state-of-the-art methods.

2. Automated Patent Classification Using Word Embedding
Patent classification is the task of assigning a unique code to a patent, where the assigned code is used to group patents with a similar subject into the same class. This paper presents a patent classification method based on word embedding and a long short-term memory network to classify patents down to the subgroup IPC level. The experimental results show that the classification method achieves 63% accuracy at the subgroup level.

3. Deep visual-semantic alignments for generating image descriptions
A model that generates natural language descriptions of images and their regions. The approach leverages datasets of images and their sentence descriptions to learn about the inter-modal correspondences between language and visual data. The alignment model is based on a novel combination of Convolutional Neural Networks over image regions, bidirectional Recurrent Neural Networks over sentences, and a structured objective that aligns the two modalities through a multimodal embedding. A Multimodal Recurrent Neural Network architecture then uses the inferred alignments to learn to generate

novel descriptions of image regions. We show that the alignment model produces state-of-the-art results in retrieval experiments on the Flickr8K, Flickr30K and MSCOCO datasets. We then show that the generated descriptions significantly outperform retrieval baselines on both full images and on a new dataset of region-level annotations.

4. Latent semantic sparse hashing for cross-modal similarity search
A novel hashing technique, referred to as Latent Semantic Sparse Hashing (LSSH), for large-scale cross-modal similarity search between images and texts. Specifically, it uses Sparse Coding to capture high-level salient structures of images, and Matrix Factorization to extract latent concepts from texts. These high-level semantic features are then mapped to a joint abstraction space. Search performance can be improved by merging multiple complete latent semantic representations from heterogeneous data. An iterative procedure is proposed which is highly efficient for exploring the relationship between multi-modal representations and bridging the semantic gap between heterogeneous data in the latent semantic space. Extensive experiments on three multi-modal datasets consisting of images and texts show superior and stable performance of LSSH compared against several state-of-the-art cross-modal hashing techniques.

5. Click-through-based cross-view learning for image search
This work explores the problem of directly learning the multi-view distance between a textual query and an image by using both click data and subspace learning methods. The click data represents the click relations among queries and images, while the subspace learning aims to learn a latent common subspace between the different


views. A novel click-through-based cross-view learning method is proposed to solve the problem in a principled way. In particular, two different linear mappings are used to project textual queries and visual images into a latent subspace. The mappings are learned by jointly minimizing the distance between the observed query-image pairs on the click-through bipartite graph and preserving the intrinsic structure in the original single view. In addition, orthogonality assumptions are made on the mapping matrices, so the mappings can be obtained efficiently through curvilinear search. The l2 norm between the projections of query and image in the latent subspace is taken as the distance function to measure the relevance of a (query, image) pair.

6. Boosting cross-media retrieval via visual-auditory feature analysis and relevance feedback
Diverse kinds of media data express high-level semantics from different aspects. How to learn comprehensive high-level semantics from different types of data and enable efficient cross-media retrieval has become an emerging hot issue. There are rich correlations among heterogeneous low-level media content, which makes it challenging to query cross-media data effectively. This paper proposes a new cross-media retrieval method based on short-term and long-term relevance feedback. The method mainly focuses on two typical types of media data, i.e. image and audio. First, a multimodal representation is built via statistical canonical correlation between the image and audio feature matrices, and a cross-media distance metric is defined for similarity measure; then an optimization strategy based on relevance feedback is proposed, which fuses short-term

learning results and long-term accumulated knowledge into the objective function. Experiments on an image-audio dataset have demonstrated the superiority of the method over several existing algorithms.

3. EXISTING SYSTEM APPROACH
Along with increasing requirements, cross-media search tasks have recently received considerable attention. Since every modality has different representation methods and correlational structures, a variety of techniques have examined the problem from the angle of learning relationships between the different modalities. Besides the effectiveness of hashing-based methods, there is also a rich line of work on the problem of mapping multi-modal high-dimensional data to low-dimensional hash codes, for example Latent Semantic Sparse Hashing (LSSH), Discriminative Coupled Dictionary Hashing (DCDH), and Cross-View Hashing (CVH). In the existing system, when the user searches data on Flickr, the results contain relevant as well as irrelevant images. Irrelevant data is the main drawback of the existing system; moreover, the existing search uses plain text only, so the time required for searching is high.

4. PROPOSED SYSTEM APPROACH

Fig.1 Block Diagram of Proposed System

We propose a novel hashing method, called Semantic Cross-Media Hashing (SCMH), to perform the near-duplicate detection and cross-media retrieval task. We propose to use a set of word embeddings to represent textual information.
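As a rough, self-contained sketch of this representation-and-hashing idea (the three-dimensional toy vectors below are hypothetical stand-ins for trained skip-gram embeddings, and the binarization scheme is illustrative rather than the paper's exact procedure):

```python
import hashlib

# Toy word embeddings standing in for trained skip-gram vectors
# (hypothetical values, for illustration only).
EMBEDDINGS = {
    "sunset": [0.9, 0.1, 0.4],
    "beach":  [0.8, 0.2, 0.3],
    "ocean":  [0.7, 0.3, 0.5],
}

def text_vector(words):
    """Average the word embeddings to get one fixed-length text vector."""
    dims = len(next(iter(EMBEDDINGS.values())))
    total = [0.0] * dims
    known = 0
    for w in words:
        if w in EMBEDDINGS:
            known += 1
            for i, v in enumerate(EMBEDDINGS[w]):
                total[i] += v
    return [t / known for t in total] if known else total

def hash_code(vector, bits=32):
    """Derive a binary hash code from an MD5 digest of the quantized vector."""
    quantized = ",".join(f"{v:.3f}" for v in vector)
    digest = hashlib.md5(quantized.encode("utf-8")).digest()
    code = "".join(f"{byte:08b}" for byte in digest)
    return code[:bits]

vec = text_vector(["sunset", "beach"])
code = hash_code(vec)
print(len(code))  # prints 32
```

Because MD5 is deterministic, identical feature vectors always map to the same code, which is the property the hash-based lookup relies on; the mapping learned by the DBN would sit between the raw Fisher vectors and this hashing step.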


The proposed framework consists of two modules, user and administrator. The administrator can add images and perform other functions, and the user can search for an image using text as well as another image. For text-based search the word embedding algorithm is used, and for image-based search a feature descriptor algorithm is used. The main drawback of the existing system is that a Flickr search shows relevant as well as irrelevant data; in our system this drawback is removed and only relevant images are returned. The Fisher kernel framework is incorporated to represent both textual and visual data with fixed-length vectors. For mapping the Fisher vectors of the different modalities, a deep belief network is proposed to perform the task. We evaluate the proposed method SCMH on three commonly used data sets. In the proposed system, data is searched over hash values, and for calculating hash values we use the MD5 algorithm. Different approaches have been proposed to capture the similarities between text and images. In this method we use the Skip-gram algorithm for word embedding, the SIFT descriptor to extract the key points from the images, and the MD5 algorithm for hash code generation. SCMH achieves better results than state-of-the-art methods with various hash code lengths. When the user searches for an image using text, a feature vector is calculated with the word embedding algorithm; the images are then mapped and ranked, and the user sees the accurate image. When the user searches using an image, a feature vector is calculated with the feature descriptor algorithm, the images are mapped and ranked, and after

ranking the user can see the proper image. Images can also be searched by hash value using the MD5 algorithm. Thus various methods are used for searching images, and the images are likewise ranked according to the search; the ranking of the images found for the user's search can also be displayed.

5. CONCLUSION
In this work, we propose a new hashing method, SCMH, for the near-duplicate detection and cross-media retrieval task. We propose to use a series of word embeddings to represent textual information. The proposed system removes the drawback of Flickr: the user can search the data using text as well as images, and can also search the images using hash values. The Fisher kernel framework is built to represent both textual and visual information with fixed-length vectors. To map the Fisher vectors of the different modalities, a deep belief network is intended to do the operation. We evaluate the proposed SCMH on three commonly used data sets. SCMH outperforms state-of-the-art methods with different hash code lengths. On the MIR Flickr data set, the improvements of SCMH over LSSH, which achieves the best results among these methods on these data sets, are 10.0 and 18.5 percent for the text-to-image and image-to-text tasks, respectively. Experimental results demonstrate the effectiveness of the proposed cross-media retrieval method. The user can also see ranked images according to the search. In future work, image-based search could be extended to other social media such as Facebook and Twitter.

6. ACKNOWLEDGMENT
This work is supported in a mix generative system of any state in India. Authors are thankful to the Faculty of Engineering and Technology (FET), Savitribai Phule Pune University, Pune for providing the facility to carry out the research work.

REFERENCES
[1] Liangrong Zhang, Kai Jiang, Yaoguo Zheng, Jinliang An, Yanning Hu, Licheng Jiao,


"Spatially Constrained Bag-of-Visual-Words for Hyperspectral Image Classification," International Research Center for Intelligent Perception and Computation, Xidian University, Xi'an 710071, China, 2016.
[2] Mattyws F. Grawe, Claudia A. Martins, Andreia G. Bonfante, "Automated Patent Classification Using Word Embedding," 16th IEEE International Conference on Machine Learning and Applications, Federal University of Mato Grosso, Cuiaba, Brazil, 2017.
[3] A. Karpathy and L. Fei-Fei, "Deep visual-semantic alignments for generating image descriptions," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., Boston, MA, USA, Jun. 2015, pp. 3128-3137.
[4] J. Zhou, G. Ding, and Y. Guo, "Latent semantic sparse hashing for cross-modal similarity search," in Proc. 37th Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, 2014, pp. 415-424.
[5] Y. Pan, T. Yao, T. Mei, H. Li, C.-W. Ngo, and Y. Rui, "Click-through-based cross-view learning for image search," in Proc. 37th Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, 2014, pp. 717-726.
[6] H. Zhang, J. Yuan, X. Gao, and Z. Chen, "Boosting cross-media retrieval via visual-auditory feature analysis and relevance feedback," in Proc. ACM Int. Conf. Multimedia, 2014, pp. 953-956.


AN EFFICIENT ALGORITHM FOR MINING TOP-K HIGH UTILITY ITEMSET
Ahishek Doke1, Akshay Bhosale2, Sanket Gaikwad3, Shubham Gundawar4
1,2,3,4

Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon, Pune, India. [email protected], [email protected], [email protected], [email protected]

ABSTRACT
Data mining is a computerized process of searching for models in large data sets that involves methods at the intersection of database systems. A popular problem in data mining is the extraction of high utility itemsets (HUIs) or, more generally, utility mining. The HUI (high utility itemset) problem is mainly an extension of frequent itemset mining. Frequent pattern mining is a widespread problem in data mining, which involves searching for frequent patterns in transaction databases. We address the high utility itemset (HUI) problem with particular data and state-of-the-art algorithms. Many popular algorithms have been proposed for this problem, such as Apriori, FP-growth, etc., but now the most popular are TKO (mining Top-K utility itemsets in One phase) and TKU (mining Top-K Utility itemsets). In this paper, we address previous issues by proposing a new framework for top-k HUI mining, where k is the desired number of HUIs to extract. High utility itemset mining is an uncommon term, but we make use of it while shopping online, etc.; it is part of business analysis. The main application area is market basket analysis: when a customer buys an item, he may buy another, maximizing the benefit of both the customer and the supplier.

Keywords- Utility mining, high utility itemset, top-k pattern mining, top-k high utility itemset mining.

1. INTRODUCTION
Data mining is the efficient discovery of valuable and vivid information from a vast collection of data. Frequent itemset mining (FIM) discovers only the frequent elements, not the set of high utility items (HUIs). In FIM the utilities of the itemsets are not considered, because the purchase amount is not taken into account. Data mining is the process of analyzing data from different points of view and summarizing it into useful data. It is a tool for analyzing data: it allows users to analyze data from different levels or angles, organize it, and find the relationships between the data. Data mining is the process of finding patterns among the fields of a large relational database. A classic algorithm based on top-k models consists of two phases. In the first phase, called phase I, the complete set of high transaction-weighted utility itemsets (HTWUIs) is found. In the second phase, called phase II, all HUIs are obtained by calculating the exact utilities of the HTWUIs with a database scan. Although many studies have been devoted to HUI mining, it is difficult for users to effectively choose an appropriate minimum utility threshold. Depending on the threshold, the size of the output can be very small or very large. The choice of the threshold also significantly impacts the performance of the algorithms: if the threshold is too low, too many HUIs will be presented to the users, and it will be difficult for them to understand the results. A large number of HUIs makes data mining algorithms unproductive or out of


memory; subsequently, the more HUIs the algorithm generates, the more resources it consumes. Conversely, if the threshold is too high, no HUIs will be found.

1.1 Background
Existing methods frequently generate a huge set of HUIs, and their mining performance degrades as a consequence. This condition may become worse when the dataset contains long transactions or low thresholds are set. The huge number of HUIs poses a challenging problem for mining performance, since the more HUIs the algorithm generates, the more processing time it consumes. To overcome these challenges, efficient algorithms are presented. Top-k will not work on parallel mining.

1.2 Motivation
1. Setting the value of k is more intuitive than setting a threshold, because k represents the number of itemsets that users want to find, whereas choosing the threshold depends primarily on database characteristics, which are often unknown to users.
2. In top-k HUI mining, the min-utility value is not given in advance. In traditional HUI mining the search space can be pruned efficiently by the algorithm using a given min-utility threshold value; in the scenario of the TKO and TKU algorithms, no min-utility threshold value is provided in advance.

1.3 Aim & Objective
1. The execution time of the TKO algorithm is lower, but its result can be incorrect, containing garbage values; the execution time of the TKU algorithm is higher, but its result is correct. It is a very challenging issue to make a hybrid algorithm (TKO with TKU) more efficient than TKU alone; the time factor is very important here.
2. To achieve significantly better performance.
3. The hybrid algorithm obtains HUIs with fixed parameters of rating, views and number of buys.

2. LITERATURE SURVEY

1. "Efficient tree structures for high-utility pattern mining in incremental databases"
Recently, high utility pattern (HUP) mining is one of the most important research issues in data mining due to its ability to consider the non-binary frequency values of items in transactions and different profit values for every item. On the other hand, incremental and interactive data mining provide the ability to use previous data structures and mining results in order to reduce unnecessary calculations when a database is updated, or when the minimum threshold is changed. In this paper, we propose three novel tree structures to efficiently perform incremental and interactive HUP mining. The first tree structure, Incremental HUP Lexicographic Tree (IHUPL-Tree), is arranged according to an item's lexicographic order. It can capture the incremental data without any restructuring operation. The second tree structure is the IHUP Transaction Frequency Tree (IHUPTF-Tree), which obtains a compact size by arranging items according to their transaction frequency (descending order). To reduce the mining time, the third tree, the IHUP Transaction-Weighted Utilization Tree (IHUPTWU-Tree), is designed based on the TWU value of items in descending order. Extensive performance analyses show that our tree structures are very efficient and scalable for incremental and interactive HUP mining.
2. "Mining high-utility item sets"
Traditional association rule mining algorithms only generate a large number of highly frequent rules, but these rules do not provide useful answers for what the high utility rules are. We develop a novel idea of top-K objective-directed data mining, which focuses on mining the top-K high utility closed patterns that directly support a given business objective. To association mining, we add the concept of utility to capture highly desirable statistical patterns and present a level-wise item-set mining algorithm. With both positive and negative utilities, the anti-monotone

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 233

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

pruning strategy in the Apriori algorithm no longer holds. In response, we develop a new pruning strategy based on utilities that allows pruning of low utility itemsets to be done by means of a weaker but anti-monotonic condition. Our experimental results show that our algorithm does not require a user-specified minimum utility and hence is effective in practice.
3. "Mining top-k frequent closed patterns without minimum support"
In this paper, we propose a new mining task: mining top-k frequent closed patterns of length no less than min_ℓ, where k is the desired number of frequent closed patterns to be mined, and min_ℓ is the minimal length of each pattern. An efficient algorithm, called TFP, is developed for mining such patterns without minimum support. Two methods, closed-node-count and descendant-sum, are proposed to effectively raise the support threshold and prune the FP-tree both during and after its construction. During the mining process, a novel top-down and bottom-up combined FP-tree mining strategy is developed to speed up support-raising and closed frequent pattern discovering. In addition, a fast hash-based closed pattern verification scheme has been employed to check efficiently whether a potential closed pattern is really closed. Our performance study shows that in most cases, TFP outperforms CLOSET and CHARM, two efficient frequent closed pattern mining algorithms, even when both are running with the best tuned min-support. Furthermore, the method can be extended to generate association rules and to incorporate user-specified constraints.
4. "Mining frequent patterns without candidate generation"
Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been studied popularly in data mining research. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist prolific patterns

and/or long patterns. In this study, we propose a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth. Efficiency of mining is achieved with three techniques: (1) a large database is compressed into a highly condensed, much smaller data structure, which avoids costly, repeated database scans, (2) our FP-tree-based mining adopts a pattern fragment growth method to avoid the costly generation of a large number of candidate sets, and (3) a partitioning-based, divide-and-conquer method is used to decompose the mining task into a set of smaller tasks for mining confined patterns in conditional databases, which dramatically reduces the search space. Our performance study shows that the FP-growth method is efficient and scalable for mining both long and short frequent patterns, and is about an order of magnitude faster than the Apriori algorithm and also faster than some recently reported new frequent pattern mining methods.
5. "Novel Concise Representations of High Utility Itemsets Using Generator Patterns"
Mining High Utility Itemsets (HUIs) is an important task with many applications. However, the set of HUIs can be very large, which makes HUI mining algorithms suffer from long execution times and huge memory consumption. To address this issue, concise representations of HUIs have been proposed. However, no concise representation of HUIs has been proposed based on the concept of generator, despite the fact that it provides several benefits in many applications. In this paper, we incorporate the concept of generator into HUI mining and devise two new concise representations of HUIs, called High Utility Generators (HUGs) and Generators of High Utility Itemsets (GHUIs). Two efficient algorithms named


HUG-Miner and GHUI-Miner are proposed to respectively mine these representations. Experiments on both real and synthetic datasets show that the proposed algorithms are very efficient and that these representations are up to 36 times smaller than the set of all HUIs.
6. "Mining Top-K Sequential Rules"
Mining sequential rules requires specifying parameters that are often difficult to set (the minimal confidence and minimal support). Depending on the choice of these parameters, current algorithms can become very slow and generate an extremely large amount of results, or generate too few results, omitting valuable information. This is a serious problem because in practice users have limited resources for analyzing the results and thus are often only interested in discovering a certain amount of results, and fine-tuning the parameters can be very time-consuming. In this paper, we address this problem by proposing TopSeqRules, an efficient algorithm for mining the top-k sequential rules from sequence databases, where k is the number of sequential rules to be found and is set by the user. Experimental results on real-life datasets show that the algorithm has excellent performance and scalability.
7. "Direct Discovery of High Utility Itemsets without Candidate Generation"
Utility mining emerged recently to address the limitation of frequent itemset mining by introducing interestingness measures that reflect both the statistical significance and the user's expectation. Among utility mining problems, utility mining with the itemset share framework is a hard one, as no anti-monotone property holds with the interestingness measure. The state-of-the-art works on this problem all employ a two-phase, candidate generation approach, which suffers from a scalability issue due to the huge number of candidates. This paper proposes a high utility itemset growth approach that works in a single phase without generating candidates. Our basic approach is to enumerate itemsets by

prefix extensions, to prune the search space by utility upper bounding, and to maintain original utility information in the mining process by a novel data structure. Such a data structure enables us to compute a tight bound for powerful pruning and to directly identify high utility itemsets in an efficient and scalable way. We further enhance the efficiency significantly by introducing recursive irrelevant item filtering with sparse data, and a lookahead strategy with dense data. Extensive experiments on sparse and dense, synthetic and real data suggest that our algorithm outperforms the state-of-the-art algorithms by over one order of magnitude.
8. "Mining High Utility Itemsets in Big Data"
In recent years, extensive studies have been conducted on high utility itemset (HUI) mining with wide applications. However, most of them assume that data are stored in centralized databases with a single machine performing the mining tasks. Consequently, existing algorithms cannot be applied to big data environments, where data are often distributed and too large to be dealt with by a single machine. To address this issue, we propose a new framework for mining high utility itemsets in big data. A novel algorithm named PHUI-Growth (Parallel mining High Utility Itemsets by pattern-Growth) is proposed for parallel mining of HUIs on the Hadoop platform, which inherits several nice properties of Hadoop, including easy deployment, fault recovery, low communication overheads and high scalability. Moreover, it adopts the MapReduce architecture to partition the whole mining task into smaller independent subtasks and uses the Hadoop distributed file system to manage distributed data, so that it allows parallel discovery of HUIs from distributed data across multiple commodity computers in a reliable, fault-tolerant manner. Experimental results on both synthetic and real datasets show that PHUI-Growth has high performance on large-scale datasets


and outperforms state-of-the-art non-parallel HUI mining algorithms.
9. "Isolated items discarding strategy for discovering high utility item sets"
Traditional methods of association rule mining consider the appearance of an item in a transaction, whether or not it is purchased, as a binary variable. However, customers may purchase more than one of the same item, and the unit cost may vary among items. Utility mining, a generalized form of the share mining model, attempts to overcome this problem. Since the Apriori pruning strategy cannot identify high utility itemsets, developing an efficient algorithm is crucial for utility mining. This study proposes the Isolated Items Discarding Strategy (IIDS), which can be applied to any existing level-wise utility mining method to reduce candidates and improve performance. The most efficient known models for share mining are ShFSM and DCG, which also work adequately for utility mining. By applying IIDS to ShFSM and DCG, the two methods FUM and DCG+ were implemented, respectively. For both synthetic and real datasets, experimental results reveal that the performance of FUM and DCG+ is more efficient than that of ShFSM and DCG, respectively. Therefore, IIDS is an effective strategy for utility mining.
10. "ExMiner: An efficient algorithm for mining top-k frequent patterns"
Conventional frequent pattern mining algorithms require users to specify some minimum support threshold. If the specified value is large, users may lose interesting information. In contrast, a small minimum support threshold results in a huge set of frequent patterns that users may not be able to screen for useful knowledge. To solve this problem and make algorithms more user-friendly, the idea of mining the k most interesting frequent patterns has been proposed. This idea is based upon an algorithm for mining frequent patterns without a minimum support threshold, but with a k number of

highest frequency patterns. In this paper, we propose an explorative mining algorithm, called ExMiner, to mine the k most interesting (i.e. top-k) frequent patterns from large-scale datasets effectively and efficiently. ExMiner is then combined with the idea of ―build once, mine anytime‖ to mine top-k frequent patterns sequentially. Experiments on both synthetic and real data show that the proposed methods are more efficient compared to the existing ones.

3. PROPOSED SYSTEM
In the proposed framework, we address the problems mentioned above by proposing a system for mining top-k high utility itemsets in parallel using TKU and TKO. The two algorithms, TKU (mining Top-K Utility itemsets) and TKO (mining Top-K utility itemsets in One phase), mine such itemsets without the need to set a minimum utility threshold. However, TKO alone has the disadvantage that its result can accumulate low-value (garbage) itemsets among the high utility itemsets, while TKU alone gives a better result but has a high execution time. The alternative solution is therefore to combine the two in the proposed system: the top-k result of the one-phase TKO is given as the input to TKU, so that the utility of the combined TKO-and-TKU result is increased while the execution time stays low. In the proposed system, a new algorithm combining the two is generated, named TKO WITH TKU, or TKMHUI (top-k high utility itemsets).

Modules:
Module 1 - Administrator (Admin)
The administrator preserves the database of the transactions made by customers. In the daily market, each day a new product is released, so the administrator would

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 236


add the product or items, update the new product, and view the stock details.
Module 2 - User (Customer)
The customer can purchase a number of items. The entire purchased-items history is stored in the transaction database.
Module 3 - Construction of the UP-Tree
In the UP-Tree, a dynamic table is generated by the algorithms. Mainly, UP-Growth is used to obtain the PHUI item set.
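As a minimal illustration of the utility model underlying PHUIs (purchase quantities times unit profits, rather than binary item presence), the following sketch computes the utility of an itemset over a small transaction database; the item names, quantities and profit values are invented purely for illustration.

```python
# Hypothetical transaction database: each transaction maps item -> purchased quantity.
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "beer": 6},
    {"milk": 2, "beer": 3, "bread": 1},
]
# External utility: unit profit per item (assumed values).
profit = {"bread": 1, "milk": 3, "beer": 5}

def itemset_utility(itemset, db, profit):
    """Sum, over transactions containing ALL items of the itemset,
    of quantity * unit profit for the itemset's items."""
    total = 0
    for t in db:
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset)
    return total

print(itemset_utility({"bread", "beer"}, transactions, profit))  # 1*1+6*5 + 1*1+3*5 = 47
```

This is exactly the quantity-aware measure that makes Apriori-style support pruning inapplicable: an itemset's utility is neither monotone nor anti-monotone in its subsets.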

Module 4 - TKO and TKU Algorithms
In the combination of TKO and TKU, the TKO (top-k in one phase) algorithm is called first, and the output of TKO is given as the input of TKU (top-k in utility phases); the actual result is then the TKU result.

Fig 1: System Architecture (blocks: End User Select Book Category, UP-Growth Algorithms, K Value, TKO Algorithms, Select Category, TKU Algorithms, Data Base, Parallel and Pattern Algorithms, Result With K Value)
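At a very high level, the top-k mining step in the architecture above can be sketched as a brute-force top-k high utility itemset search. This is an illustrative stand-in, not the actual TKO/TKU implementations, and it reuses a small fabricated database:

```python
from itertools import combinations

# Hypothetical transactions (item -> quantity) and unit profits.
transactions = [{"a": 1, "b": 2}, {"a": 2, "c": 1}, {"b": 1, "c": 4}]
profit = {"a": 2, "b": 1, "c": 3}

def utility(itemset, t):
    # Utility of an itemset inside one transaction (0 if not fully contained).
    if not all(i in t for i in itemset):
        return 0
    return sum(t[i] * profit[i] for i in itemset)

def top_k_hui(db, k):
    """Enumerate all itemsets and return the k with the highest total utility.
    No minimum-utility threshold is required from the user, which is the point
    of top-k HUI mining; real algorithms such as TKO/TKU avoid this exhaustive
    enumeration by raising an internal utility threshold as they search."""
    items = sorted({i for t in db for i in t})
    scored = []
    for r in range(1, len(items) + 1):
        for combo in combinations(items, r):
            total = sum(utility(combo, t) for t in db)
            scored.append((total, combo))
    scored.sort(reverse=True)
    return scored[:k]

for total, itemset in top_k_hui(transactions, 3):
    print(itemset, total)
```

The exponential enumeration here is what the UP-Tree/PHUI machinery and the one-phase TKO pruning are designed to avoid on realistic datasets.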

4. CONCLUSION
In this paper, we studied the problem of top-k high utility itemset mining, where k is the desired number of high utility itemsets to be mined. The combination TKO WITH TKU of the TKO and TKU algorithms is proposed to mine such itemsets without setting a minimum utility threshold. TKO, the first single-phase algorithm developed for top-k HUI mining, produces PHUIs (potential high utility itemsets), and these PHUIs are given to TKU for the utility phases. Empirical evaluations on different types of real and synthetic

data sets show that the proposed algorithms have good scalability on large data sets and that their performance is close to the optimal case of combining both phases in one algorithm.

REFERENCES
[1] V. S. Tseng, C.-W. Wu, P. Fournier-Viger, and P. S. Yu, ―Efficient algorithms for mining top-k high utility itemsets,‖ IEEE Trans. Knowl. Data Eng., vol. 28, no. 1, Jan. 2016.
[2] C. Ahmed, S. Tanbeer, B. Jeong, and Y. Lee, ―Efficient tree structures for high-utility


pattern mining in incremental databases,‖ IEEE Trans. Knowl. Data Eng., vol. 21, no. 12, pp. 1708–1721, Dec. 2009.
[3] R. Chan, Q. Yang, and Y. Shen, ―Mining high-utility itemsets,‖ in Proc. IEEE Int. Conf. Data Mining, 2003, pp. 19–26.
[4] J. Han, J. Wang, Y. Lu, and P. Tzvetkov, ―Mining top-k frequent closed patterns without minimum support,‖ in Proc. IEEE Int. Conf. Data Mining, 2002, pp. 211–218.
[5] J. Han, J. Pei, and Y. Yin, ―Mining frequent patterns without candidate generation,‖ in Proc. ACM SIGMOD Int. Conf. Manag. Data, 2000, pp. 1–12.
[6] P. Fournier-Viger, C. Wu, and V. S. Tseng, ―Novel concise representations of high utility itemsets using generator patterns,‖ in Proc. Int. Conf. Adv. Data Mining Appl., Lecture Notes Comput. Sci., vol. 8933, 2014, pp. 30–43.

ISSN:0975-887

[7] P. Fournier-Viger and V. S. Tseng, ―Mining top-k sequential rules,‖ in Proc. Int. Conf. Adv. Data Mining Appl., 2011, pp. 180–194.
[8] J. Liu, K. Wang, and B. Fung, ―Direct discovery of high utility itemsets without candidate generation,‖ in Proc. IEEE Int. Conf. Data Mining, 2012, pp. 984–989.
[9] Y. Lin, C. Wu, and V. S. Tseng, ―Mining high utility itemsets in big data,‖ in Proc. Pacific-Asia Conf. Knowl. Discovery Data Mining, 2015, pp. 649–661.
[10] Y. Li, J. Yeh, and C. Chang, ―Isolated items discarding strategy for discovering high-utility itemsets,‖ Data Knowl. Eng., vol. 64, no. 1, pp. 198–217, 2008.
[11] T. Quang, S. Oyanagi, and K. Yamazaki, ―ExMiner: An efficient algorithm for mining top-k frequent patterns,‖ in Proc. Int. Conf. Adv. Data Mining Appl., 2006, pp. 436–447.


SARCASM DETECTION USING TEXT FACTORIZATION ON REVIEWS Tejaswini Murudkar1, Vijaya Dabade2, Priyanka Lodhe3, Mayuri Patil4, Shailesh Patil5 1,2,3,4,5

Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India. [email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
The research areas of sentiment analysis, opinion mining, sentiment mining and sentiment extraction have gained popularity in recent years. Online reviews are becoming an important criterion in measuring the quality of a business. This paper presents a sentiment analysis approach to business review classification using a large review dataset provided by Yelp: the Yelp Challenge dataset. In this work, we propose several approaches for automatic sentiment classification, using two feature extraction methods and four machine learning models, and illustrate a comparative study on the effectiveness of ensemble methods for review sentiment classification.

1. INTRODUCTION
Sentiment analysis has become an important research area for understanding people‘s opinion on a matter by analyzing a large amount of information. The active feedback of the people is valuable not only for companies to analyze their customers‘ satisfaction and monitor competitors, but is also very useful for consumers who want to research a product or a service prior to making a purchase.

2. MOTIVATION
With the increased amount of data collection taking place as a result of social media interaction, scientific experiments, and even e-commerce applications, the nature of data as we know it has been evolving. As a result of this data generation from many different sources, ―new generation‖ data presents challenges, as it is not all relational and lacks predefined structures. In this project we try to address these issues and provide a way for better acquisition and processing of this type of data. We will be analyzing real-time social network data, trying to eliminate fake reviews and analyze the sarcasm in them.

3. STATE OF ART
Mondher Bouazizi and Tomoaki Ohtsuki [1] explained the use of part-of-speech tags to extract patterns characterizing the level of sarcasm of tweets; the number of patterns extracted from their training set is 346,541. Mondher Bouazizi and Tomoaki Ohtsuki [2] ran the classification using the classifiers ―Random Forest‖, ―Support Vector Machine‖ (SVM), ―k Nearest Neighbors‖ (k-NN) and ―Maximum Entropy‖. Huaxun Deng, Linfeng Zhao et al. [3] used the similarity characteristics of the text to determine a set of true negative cases, or fake reviews, and extracted the characteristic vector from multiple aspects. Then, the K-Means technique is used to cluster the comments: a comment is labeled as a negative case if it is close to the true negative cases, and as a positive case if it is far away from the trusted negative cases. Shalini Raghav and Ela Kumar [4] identified pattern extraction, hashtag-based and contextual approaches. Tanya Jain, Nilesh Agrawal et al. [5] analyzed the problem of sarcasm as positive sentiments attached to a negative situation. The work uses two approaches, a voted classifier and a random forest


classifier, and in the proposed model they used a seeding algorithm and a pragmatic classifier to detect emoticon-based sarcasm. Edwin Lunando and Ayu Purwarianti [6] analyzed sarcasm detection for sentiment analysis of Indonesian social media. To solve the high computational overhead and low classification efficiency of the KNN algorithm, a text feature vector representation method based on information gain and non-negative matrix factorization has been proposed [7].

4. GAP ANALYSIS
Previously, sarcasm detection was done on a fixed dataset: the dataset was saved and then further processing started, so the dataset was, or could be, manipulated easily while stored. In our process, sarcasm detection is done on real-time data. The data is not saved permanently: the moment you refresh, the data is cleared from memory and new data is shown. The data is only stored temporarily, through MongoDB. As the data is not persisted, manipulation of the data is impossible, and the results are more accurate and unbiased.

5. PROPOSED WORK
The data rate generated in the digital universe is escalating exponentially. Current tools and technologies for analyzing and storing such massive volumes of data are not up to the mark, since they are unable to extract the required sample data sets. Therefore, we must design an architectural platform for analyzing both remote-access real-time data and offline data. When a business enterprise can pull out all the useful information obtainable in Big Data, rather than a sample of its data set, it has an influential benefit over its market competitors. Big Data analytics helps us gain insight and make better decisions. Therefore, with the intention of using Big Data, modifications in

paradigms are essential. To support our motivation, we have described some areas where Big Data can play an important role. In healthcare scenarios, medical practitioners gather massive volumes of data about patients: medical history, medications, and other details. These data are also accumulated by drug-manufacturing companies. The nature of these data is very complex, and sometimes practitioners are unable to relate them to other information, which results in important information being missed. Employing advanced analytic techniques for organizing and extracting useful information from Big Data results in personalized medication, and such techniques give insight into hereditary causes of disease. In the same way, data is also generated for product reviews across various services, but sometimes we have to differentiate between fake reviews and genuine reviews as input to our business decision-making process.

6. CONCLUSION AND FUTURE WORK
CONCLUSION
Sarcasm is a complex phenomenon. In this project, we could see how a real-time dataset can act as a huge asset in terms of data gathering, and how a few basic features such as punctuation can be powerful in the detection (accuracy) of a sophisticated language form like sarcasm. Data pre-processing and feature engineering are among the most important tasks for improving accuracy, and more in-depth analysis in these domains will improve accuracy considerably. The goal of the system is to efficiently detect sarcasm and classify it into positive, negative and neutral categories: not just detecting the sarcasm, but also classifying reviews into positive, negative and


neutral categories, presented in the form of a graphical representation.
FUTURE WORK
In the section above, a few improvements that can be incorporated in the feature set have been mentioned. Apart from these, we can include a topic-based feature set. Another major improvement can be made in data processing: spell checks, word-sense disambiguation and slang detection can make the data cleaner and help in better classification. Also, the ratio of sarcastic to non-sarcastic data is quite high, which is not the case in the real world, hence we need to gather more data with a lower ratio to get the real performance measure of our system. Along with detecting sarcasm, we will also be looking for foul language and detecting it in the reviews; the data will not be saved, hence there is no scope for manipulation of the data, and the results will be fairly unbiased. We will not limit this to reviews; we will be adding this process to comments as well.
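The basic punctuation signals mentioned in the conclusion, one of the simplest feature families used for sarcasm detection, can be sketched as a small feature extractor; the feature set and the sample sentence here are illustrative, not the project's actual ones.

```python
def punctuation_features(text):
    """Extract simple surface features often used as sarcasm cues."""
    words = text.split()
    # Count fully upper-case words of length > 1 (e.g. shouted emphasis).
    n_upper = sum(1 for w in words if w.isupper() and len(w) > 1)
    return {
        "exclamations": text.count("!"),
        "questions": text.count("?"),
        "ellipses": text.count("..."),
        "all_caps_words": n_upper,
        "quotes": text.count('"'),
    }

feats = punctuation_features('Oh GREAT, another "fast" delivery... took 2 weeks!!!')
print(feats)
```

Feature dictionaries of this kind can be fed directly to any of the classifiers surveyed in the state of the art (random forest, SVM, k-NN, maximum entropy).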

[2] A Pattern-Based Approach for Sarcasm Detection on Twitter (2016), Mondher Bouazizi, Tomoaki Ohtsuki
[3] Semi-supervised Learning based Fake Review Detection (2017), Huaxun Deng, Linfeng Zhao, Ning Luo, Yuan Liu, Guibing Guo, Xingwei Wang, Zhenhua Tan, Shuang Wang and Fucai Zhou
[4] Review of Automatic Sarcasm Detection (2017, review paper), Shalini Raghav, Ela Kumar
[5] Sarcasm Detection of Tweets: A Comparative Study (2017), Tanya Jain, Nilesh Agrawal, Garima Goyal, Niyati Aggrawal
[6] Indonesian Social Media Sentiment Analysis with Sarcasm Detection (2013), Edwin Lunando, Ayu Purwarianti
[7] Text Classification Algorithm Based on Nonnegative Matrix Factorization (2017), Yongxia Jing, Heping Gou, Chuanyi Fu, Qiang Liu
[8] Satire Detection from Web Documents using Machine Learning Methods (2014), Tanvir Ahmad, Halima Akhtar, Akshay Chopra, Mohd Waris
[9] Automatic Sarcasm Detection using Feature Selection (2017), Paras Dharwal, Tanupriya Choudhury, Rajat Mittal, Paveen Kumar
[10] Improvement Sarcasm Analysis using NLP and Corpus based Approach (2017), Manoj Y. Manohar, Prof. Pallavi Kulkarni
[11] Sentiment Analysis for Sarcasm Detection on Streaming Short Text Data (2017), Anukarsh G Prasad, Sanjana S, Skanda M Bhat, B S Harish

REFERENCES [1] Sarcasm Detection in Twitter (2015), Mondher Bouazizi, Tomoaki Ohtsuki


PREDICTION ON HEALTH CARE BASED ON NEAR SEARCH BY KEYWORD Mantasha Shaikh1, Sourabh Gaikwad2, Pooja Garje3, Harshada Diwate4 1,2,3,4

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India. [email protected], [email protected], [email protected], [email protected]

ABSTRACT
In our society, people pay more attention to their own health, and personalized health services are steadily rising. Due to the lack of skilled doctors and physicians, most healthcare organizations cannot meet the medical demand of the public, and the public wants more accurate and immediate results. Thus, more and more data mining applications are developed to provide people with more customized healthcare services; this is a good answer to the mismatch between insufficient medical resources and growing medical demands. We propose an AI-assisted prediction system which leverages data mining techniques to reveal the relationship between everyday physical examination data and the potential health risk of the user. The main concept is to determine medical diseases according to the given symptoms and daily routine and, when the user searches for a hospital, to return the hospital nearest to their current location. The system offers a user-friendly interface for examinees and doctors: examinees can understand their symptoms, while doctors can get a set of examinees with potential risk. A feedback mechanism saves manpower and improves the performance of the system automatically. The doctor can fix a prediction result via an interface, which will collect doctors' input as new training data; an extra training process is triggered every day using these data. Thus, our system can improve the performance of the prediction model automatically.
Keywords: Data Mining, Machine Learning, disease prediction.

1. INTRODUCTION
Many healthcare companies (hospitals, medical facilities) in China are busy serving people with best-effort healthcare service. Nowadays, people pay more attention to their physical condition; they want higher-quality and more customized healthcare service. However, with the limited number of skilled doctors and physicians, most healthcare organizations cannot meet the needs of the public. How to offer better healthcare to more people with limited manpower becomes a key problem. The healthcare environment is usually perceived as being ‗data rich' but ‗knowledge poor'. Hospital information systems usually generate a big amount of records in the form of numbers and text, and there is a lot of hidden information in these data left untouched. Data mining and predictive analytics aim to reveal patterns and rules by applying advanced data analysis techniques on a large set of data for descriptive and predictive purposes. Data mining is suitable for processing large datasets from hospital record systems and finding relations among data features; it takes only a few researchers to investigate information from hospital records. The main concept is to determine medical diseases consistent with the given symptoms and daily routine and, when the user searches for a hospital, to return the nearest hospital to their current location. The system provides a


user-friendly interface for examinees and doctors. Examinees can know their symptoms as collected over time, while doctors can get a set of examinees with potential risk. A feedback mechanism saves manpower and improves the performance of the system automatically.

1.1 MOTIVATION
a. Previous medical examiners only used basic symptoms of particular diseases, but our application also examines the word count, laboratory results and diagnostic data.
b. A feedback mechanism could save manpower and improve the performance of the system automatically. The doctor could fix a prediction result through an interface, which will collect doctors' input as new training data. An extra training process will be triggered every day using these data. Thus, our system could improve the performance of the prediction model automatically.
c. When the user visits a hospital physically, the user's personal record is saved and then added to the examiner data set, which consumes a lot of time.

1.2 AIM AND OBJECTIVES
a. The main concept is to determine medical diseases according to given symptoms and daily routine and, when the user searches for a hospital, to return the hospital nearest to their current location.
b. Determine medical diseases according to given symptoms and daily routine.
c. Prediction is done on the word count, laboratory results and diagnostic data.

2. RELATED WORK
A. ―Applications of Data Mining Techniques in Healthcare and Prediction of Heart Attacks‖ Author - Srinivas K, Rani B K, Govrdhan A. The healthcare

environment is generally perceived as being ‗information rich' yet ‗knowledge poor'. There is a wealth of data available within the healthcare systems. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. Knowledge discovery and data mining have found numerous applications in business and scientific domains, and valuable knowledge can be discovered from the application of data mining techniques in the healthcare system. In this study, we briefly examine the potential use of classification-based data mining techniques such as rule-based, decision tree, naïve Bayes and artificial neural network methods on the massive volume of healthcare data. The healthcare industry collects huge amounts of healthcare data which, unfortunately, are not "mined" to discover hidden information. For data preprocessing and effective decision making, the One Dependency Augmented Naïve Bayes classifier (ODANB) and the naive credal classifier 2 (NCC2) are used. The latter is an extension of naïve Bayes to imprecise probabilities that aims at delivering robust classifications also when dealing with small or incomplete data sets. Discovery of hidden patterns and relationships often goes unexploited. Using medical profiles such as age, sex, blood pressure, and blood sugar, the likelihood of patients getting heart disease can be predicted. It enables significant knowledge, e.g. patterns and relationships between medical factors related to heart disease, to be established. Disadvantages: for predicting heart attack, only 15 significant attributes are listed; besides the 15 listed in the medical literature, other data mining techniques, e.g., time series, clustering and association rules, could also be incorporated; categorical data is used.
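The naive Bayes idea described above (predicting heart-disease likelihood from profile attributes such as age group, blood pressure and blood sugar) can be sketched as follows; the tiny training table and attribute values are fabricated purely for illustration.

```python
from collections import Counter, defaultdict

# Fabricated training records: (age_group, blood_pressure, blood_sugar) -> label.
records = [
    (("old", "high", "high"), "risk"),
    (("old", "high", "normal"), "risk"),
    (("young", "normal", "normal"), "no_risk"),
    (("young", "high", "normal"), "no_risk"),
    (("old", "normal", "high"), "risk"),
    (("young", "normal", "high"), "no_risk"),
]

def train(records):
    label_counts = Counter(label for _, label in records)
    attr_counts = defaultdict(Counter)  # (attribute position, label) -> value tallies
    for attrs, label in records:
        for pos, value in enumerate(attrs):
            attr_counts[(pos, label)][value] += 1
    return label_counts, attr_counts

def predict(attrs, label_counts, attr_counts):
    total = sum(label_counts.values())
    best_label, best_score = None, -1.0
    for label, count in label_counts.items():
        score = count / total  # class prior
        for pos, value in enumerate(attrs):
            # Laplace smoothing avoids zero probabilities for unseen values.
            score *= (attr_counts[(pos, label)][value] + 1) / (count + 2)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

lc, ac = train(records)
print(predict(("old", "high", "high"), lc, ac))
```

The same structure extends to the richer inputs the proposed system uses (word counts, laboratory results, diagnostic data) by adding attribute positions.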




B. ―Grand challenges in clinical decision support‖ Author - Sittig D, Wright A, Osheroff J, et al. There is a pressing need for high-quality, effective means of designing, developing, presenting, implementing, evaluating, and maintaining all types of clinical decision support capabilities for clinicians, patients and consumers. Using an iterative, consensus-building process, a rank-ordered list of the top 10 grand challenges in clinical decision support was identified. This list was created to educate and inspire researchers, developers, funders, and policy-makers. The challenges, in the order of importance that they be solved if patients and organizations are to realize the fullest benefits possible of these systems, are: improve the human-computer interface; disseminate best practices in CDS design, development, and implementation; summarize patient-level information; prioritize and filter recommendations to the user; create an architecture for sharing executable CDS modules and services; combine recommendations for patients with co-morbidities; prioritize CDS content development and implementation; create internet-accessible clinical decision support repositories; use free text information to drive clinical decision support; and mine large clinical databases to create new CDS. Disadvantages: identification of solutions to these challenges is critical if clinical decision support is to achieve its potential and improve the quality, safety, and efficiency of healthcare; text mining is not used for unstructured data.

C. ―Using Electronic Health Records for Surgical Quality Improvement in the Era of Big Data‖ Author - Anderson J E, Chang D C. Many healthcare facilities enforce security on their electronic health records (EHRs) through a corrective mechanism: some staff nominally have almost unrestricted access to the records, but there is a strict ex post facto audit process for inappropriate accesses, i.e., accesses that violate the facility‘s security and privacy policies. This process is inefficient, as each suspicious access has to be reviewed by a security expert, and is purely retrospective, as it occurs after damage may have been incurred. This motivates automated approaches based on machine learning using historical data. Previous attempts at such a system have successfully applied supervised learning models to this end, such as SVMs and logistic regression. While providing benefits over manual auditing, these approaches ignore the identity of the users and patients involved in record access.

E. ―Data Mining Techniques into Telemedicine Systems‖ Author - Gheorghe M, Petre R. Providing care services through telemedicine has become an important part of the medical development process, due to the latest innovation in the information and


computer technologies. Meanwhile, data mining, a dynamic and fast-expanding domain, has improved many fields of human life by offering the possibility of predicting future trends and helping with decision making, based on the patterns and trends discovered. The diversity of data and the multitude of data mining techniques provide various applications for data mining, including in the healthcare organization. Integrating data mining techniques into telemedicine systems would help improve the efficiency and effectiveness of healthcare organizations' activity, contributing to the development and refinement of the healthcare services offered as part of the medical development process.

F. ―Query recommendation using query logs in search engines‖ Author - R. Baeza-Yates, C. Hurtado, and M. Mendoza. In this paper we propose a method that, given a query submitted to a search engine, suggests a list of related queries. The related queries are based on previously issued queries and can be issued by the user to the search engine to tune or redirect the search process. The method proposed is based on a query clustering process in which groups of semantically similar queries are identified. The clustering process uses the content of historical preferences of users registered in the query log of the search engine. The method not only discovers the related queries but also ranks them according to a relevance criterion. Finally, experiments over the query log of a search engine show the effectiveness of the method.

G. ―Data Mining Applications In Healthcare Sector: A Study‖ Author - M. Durairaj, V. In this paper, we have focused on comparing a variety of techniques, approaches, different tools, and their impact on the healthcare sector. The goal of data mining applications is to

turn that data, which may be facts, numbers, or text processable by a computer, into knowledge or information. The main purpose of data mining applications in healthcare systems is to develop an automated tool for identifying and disseminating relevant healthcare information. This paper aims to make a detailed study of different types of data mining applications in the healthcare sector and to reduce the complexity of studying healthcare data transactions. It also presents a comparative study of different data mining applications, techniques and methodologies applied for extracting knowledge from databases generated in the healthcare industry. Finally, the existing data mining techniques, with their algorithms and application tools that are most valuable for healthcare services, are discussed in detail.

H. ―Detecting Inappropriate Access to Electronic Health Records Using Collaborative Filtering‖ Author - Aditya Krishna Menon. Many healthcare facilities enforce security on their electronic health records (EHRs) through a corrective mechanism: some staff nominally have almost unrestricted access to the records, but there is a strict ex post facto audit process for inappropriate accesses, i.e., accesses that violate the facility's security and privacy policies. This process is inefficient, as each suspicious access has to be reviewed by a security expert, and is purely retrospective, as it occurs after damage may have been incurred. This motivates automated approaches based on machine learning using historical data. Previous attempts at such a system have successfully applied supervised learning models to this end, such as SVMs and logistic regression. While providing benefits over manual auditing, these approaches ignore the identity of the users and patients involved in record access. Therefore, they cannot


exploit the fact that a patient whose record was previously involved in a violation has an increased risk of being involved in a future violation. Motivated by this, in this paper, we propose a collaborative filtering inspired approach to predicting inappropriate accesses. Our solution integrates both explicit and latent features for staff and patients, the latter acting as a personalized "fingerprint" based on historical access patterns. The proposed method, when applied to real EHR access data from two tertiary hospitals and a file-access dataset from Amazon, not only shows significantly improved performance compared to existing methods, but also provides insights as to what indicates inappropriate access.

I. ―Text data mining of aged care accreditation reports to identify risk factors in medication management in Australian residential aged care homes‖ Author - Tao Jiang & Siyu Qian. This study aimed to identify risk factors in medication management in Australian residential aged care (RAC) homes. Only 18 out of 3,607 RAC homes failed the aged care accreditation standard in medication management between 7th March 2011 and 25th March 2015. Text data mining methods were used to analyze the reasons for failure. This led to the identification of 21 risk indicators for a RAC home to fail in medication management. These indicators were further grouped into ten themes: overall medication management, medication assessment, ordering, dispensing, storage, stock and disposal, administration, incident report, monitoring, staff, and resident satisfaction. The top three risk factors are: "ineffective monitoring process" (18 homes), "non-compliance with professional standards and guidelines" (15 homes), and "resident dissatisfaction with overall medication management" (10 homes).
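The counting behind risk-factor rankings of the kind reported above (how many failed homes exhibit each indicator) can be sketched as a simple tally over keyword-tagged report excerpts; the mini "reports" and indicator phrases below are fabricated for illustration, not drawn from the actual accreditation data.

```python
from collections import Counter

# Hypothetical indicator phrases mapped to their risk themes.
indicator_phrases = {
    "ineffective monitoring process": "monitoring",
    "non-compliance with professional standards": "standards",
    "resident dissatisfaction": "satisfaction",
}

# Fabricated excerpts from failed-home reports.
reports = [
    "audit found an ineffective monitoring process and resident dissatisfaction",
    "non-compliance with professional standards was observed",
    "ineffective monitoring process in medication rounds",
]

def count_indicators(reports, phrases):
    """Count, per theme, how many reports mention each indicator phrase."""
    counts = Counter()
    for text in reports:
        for phrase, theme in phrases.items():
            if phrase in text:
                counts[theme] += 1
    return counts

print(count_indicators(reports, indicator_phrases).most_common())
```

Real accreditation-report mining uses richer text processing than substring matching, but the ranking step reduces to this kind of per-theme frequency count.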

ISSN:0975-887

J. "Evaluation of radiological features for breast tumor classification in clinical screening with machine learning methods" Author: Tim W. Nattkemper, Bert Arnrich. The k-means clustering and self-organizing maps (SOM) are applied to analyze the signal structure in terms of visualization. We employ k-nearest neighbor classifiers (kNN), support vector machines (SVM) and decision trees (DT) to classify features using a computer-aided diagnosis (CAD) approach.
K. "Comparative Analysis of Logistic Regression and Artificial Neural Network for Computer-Aided Diagnosis of Breast Masses" Author: Song J H, Venkatesh S S, Conant E A. Breast cancer is one of the most common cancers in women. Sonography is now commonly used in combination with other modalities for imaging breasts. Although ultrasound can diagnose simple cysts in the breast with an accuracy of 96%–100%, its use for unequivocal differentiation between solid benign and malignant masses has proven to be more difficult. Despite considerable efforts toward improving imaging techniques, including sonography, the final confirmation of whether a solid breast lesion is malignant or benign is still made by biopsy.

3. EXISTING SYSTEM
The system leverages data mining methods to reveal the relationship between regular physical examination records and potential health risk. It can predict examinees' risk of deteriorating physical status next year based on this year's physical examination records. Examinees can know their potential health risks while doctors can get a set of examinees with potential risk. It is a good solution to the mismatch between insufficient medical resources and rising medical demands. The authors apply various supervised machine learning methods, including decision trees and XGBoost, to predict potential health risks of examinees using their physical examination records.
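As a minimal illustration of the supervised-learning step described above, the sketch below trains a depth-1 decision tree (a stump) on toy examination records. The feature names, thresholds and data are invented for illustration only; the paper's actual models (full decision trees, XGBoost) are far more elaborate.

```python
# Minimal depth-1 decision tree ("stump") for flagging at-risk examinees.
# Features and labels here are hypothetical, not from the paper's dataset.

def train_stump(rows, labels):
    """Pick the (feature, threshold) split that misclassifies fewest rows."""
    n_features = len(rows[0])
    best = None  # (errors, feature, threshold)
    for f in range(n_features):
        for t in sorted({r[f] for r in rows}):
            # predict "at risk" (1) when the feature value exceeds the threshold
            errors = sum((r[f] > t) != y for r, y in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, f, t)
    _, f, t = best
    return lambda row: int(row[f] > t)

# toy records: [systolic blood pressure, fasting glucose]
records = [[118, 90], [121, 95], [150, 140], [160, 155]]
at_risk = [0, 0, 1, 1]

model = train_stump(records, at_risk)
print(model([155, 150]))  # a high-BP, high-glucose examinee -> 1
```

A real deployment would use many more features and an ensemble such as XGBoost, but the decision logic per split is the same comparison shown here.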


Examinees can know the symptoms occurring in their body, which are set as potential health risks, while doctors can get a set of examinees with potential risk.

4. PROPOSED SYSTEM
The main concept is to determine medical diseases according to the given symptoms and daily routine; when the user searches for a hospital, the nearest hospitals to their current location are returned. The system provides a user-friendly interface for examinees and doctors. Examinees can know the symptoms occurring in their body, which are set as potential health risks, while doctors can get a set of examinees with potential risk. A feedback mechanism could save manpower and improve the performance of the system automatically. The doctor can fix a prediction result through an interface, which collects doctors' input as new training data. An extra training process is triggered every day using these data. Thus, the system can improve the performance of the prediction model automatically.

Fig 1: System Overview (user registration and login; hospital registration with specialization by admin; search by hospital name, doctor name or specialization; symptoms given by the user; disease prediction; appointment with the selected doctor, who predicts the disease and prescribes medicines)
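The daily feedback-retraining loop described above can be sketched as follows. The model and storage layer are placeholders with hypothetical names; the paper does not specify concrete APIs, and the 1-nearest-neighbour predictor here merely stands in for the real classifier.

```python
# Sketch of the doctor-feedback retraining loop (illustrative only).

feedback_buffer = []          # (symptoms, corrected_disease) pairs from doctors
training_set = [              # seed data: symptom set -> disease label
    ({"fever", "cough"}, "flu"),
    ({"chest pain", "sweating"}, "cardiac risk"),
]

def record_feedback(symptoms, corrected_disease):
    """Doctor fixes a wrong prediction; queue it as new training data."""
    feedback_buffer.append((frozenset(symptoms), corrected_disease))

def predict(symptoms):
    """1-nearest-neighbour on symptom overlap (stand-in for the real model)."""
    best = max(training_set, key=lambda ex: len(ex[0] & set(symptoms)))
    return best[1]

def nightly_retrain():
    """Triggered once a day: fold doctor corrections into the training set."""
    training_set.extend(feedback_buffer)
    feedback_buffer.clear()

record_feedback({"fever", "rash"}, "measles")
nightly_retrain()
print(predict({"fever", "rash"}))  # -> measles
```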

Advantages:
 Increases human-computer interaction.
 The location of the user is detected.
 Recommends a hospital and doctor to the patient according to the predicted disease.
 Provides medicine for the predicted disease.
 Fast prediction system.
 Scalable, low-cost.
 Quality comparable to experts.

5. CONCLUSION

This project implements an AI-assisted prediction system which leverages data mining methods to reveal the relationship between regular physical examination records and the potential health risk given by the user or public. Different machine learning algorithms are applied to predict whether the physical status of an examinee will be in danger of deterioration next year. In our system, the user or patient searches for a hospital, and results are given according to the nearest location to the current location of the user/patient. The user/patient gives symptoms and the system will predict the diseases and will give the


medicines. We also design a feedback mechanism for doctors to fix classification results or input new training data, and the system will automatically rerun the training process to improve performance every day.

REFERENCES
[1] Zhaoqian Lan, Guopeng Zhou, Yichun Duan, Wei Yan, "AI-assisted Prediction on Potential Health Risks with Regular Physical Examination Records", IEEE Transactions on Knowledge and Data Science, 2018.
[2] Srinivas K., Rani B. K., Govrdhan A., "Applications of Data Mining Techniques in Healthcare and Prediction of Heart Attacks", International Journal on Computer Science & Engineering, 2010.
[3] Sittig D., Wright A., Osheroff J., et al., "Grand challenges in clinical decision support", Journal of Biomedical Informatics, 2008.
[4] Anderson J. E., Chang D. C., "Using Electronic Health Records for Surgical Quality Improvement in the Era of Big Data", JAMA Surgery, 2015.
[5] Gheorghe M., Petre R., "Integrating Data Mining Techniques into Telemedicine Systems", Informatica Economica Journal, 2014.
[6] R. Baeza-Yates, C. Hurtado, and M. Mendoza, "Query recommendation using query logs in search engines", in Proc. Int. Conf. Current Trends Database Technol., 2004, pp. 588–596.
[7] Koh H. C., Tan G., "Data mining applications in healthcare", Journal of Healthcare Information Management, 2005, 19(2):64–72.
[8] Menon A. K., Jiang X., Kim J., et al., "Detecting Inappropriate Access to Electronic Health Records Using Collaborative Filtering",


Machine Learning, 2014, 95(1):87–101.
[9] Jiang T., Qian S., et al., "Text Data Mining of Aged Care Accreditation Reports to Identify Risk Factors in Medication Management in Australian Residential Aged Care Homes", Studies in Health Technology & Informatics, 2017, 245:892.
[10] Nattkemper T. W., Arnrich B., Lichte O., et al., "Evaluation of radiological features for breast tumor classification in clinical screening with machine learning methods", Artificial Intelligence in Medicine, 2005, 34(2):129–139.
[11] Song J. H., Venkatesh S. S., Conant E. A., et al., "Comparative analysis of logistic regression and artificial neural network for computer-aided diagnosis of breast masses", Academic Radiology, 2005, 12(4):487–495.
[12] V. Akgün, E. Erkut, and R. Batta, "On finding dissimilar paths", European Journal of Operational Research, 121(2):232–246, 2000.
[13] T. Akiba, T. Hayashi, N. Nori, Y. Iwata, and Y. Yoshida, "Efficient top-k shortest-path distance queries on large networks by pruned landmark labeling", in Proc. AAAI, pages 2–8, 2015.
[14] A. Angel and N. Koudas, "Efficient diversity-aware search", in Proc. SIGMOD, pages 781–792, 2011.
[15] H. Bast, D. Delling, A. V. Goldberg, M. Müller-Hannemann, T. Pajor, P. Sanders, D. Wagner, and R. F. Werneck, "Route planning in transportation networks", in Algorithm Engineering, pages 19–80, 2016.
[16] Borodin A., Lee H. C., Ye Y., "Max-sum diversification, monotone submodular functions and dynamic updates", Computer Science, pages 155–166, 2012.


CRIME DETECTION AND PREDICTION SYSTEM
Aparna Vijay Bhange1, Shreya Arish Bhuptani2, Manjushri Patilingale3, Yash Kothari4, Prof. D.T. Bodake5
1,2,3,4,5 Dept. of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, India.
[email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
Crime these days has become a problem of every nation. Around the globe, many countries are trying to curb this problem. Preventive measures are taken to reduce the increasing number of cases of crime against women. A huge data set is generated every year from the reporting of crime. This data can prove very useful in analyzing and predicting crime and can help prevent crime to some extent. Crime analysis is an area of vital importance in the police department. Study of crime data can help us analyze crime patterns, inter-related clues and important hidden relations between crimes. Using data mining techniques, crime data can be predicted and visualized in various forms in order to provide a better understanding of crime patterns and to make prediction of crime easier.

General Terms
KNN algorithm

Keywords
Crime, Classification, Detection and prediction, KNN.

1. INTRODUCTION
The objective of this work is to predict whether the area a person is travelling to is safe or not. Along with this, crime capturing and women safety modules are added. For this purpose, we have used K-means clustering and KNN classification techniques. We have illustrated how social development may lead to crime prevention. The crime rates are accelerating continuously and the crime patterns are constantly changing. Crime is a violation against humanity that is often accused and punishable by law. Criminology is the study of crime; it is an interdisciplinary science that collects and investigates data on crime and criminal behaviour. Crime activities have increased nowadays, and it is the responsibility of the police department to control and reduce them [6]. According to the National Crime Records Bureau, crime against women has significantly increased in recent years. It has become a priority for the administration to enforce law and order to reduce this increasing rate of crime against women, so we need methodologies to predict and prevent crime. Data mining provides clustering and classification techniques for this purpose. Clustering is used for grouping similar patterns. Classification is a technique of data analysis that is used to extract and predict future trends in data based on similarity measures.

2. MOTIVATION
Conditions of the physical and social environment provide opportunities for or predicate criminal acts; the aims are to reduce chances of crime and to help local police stations in crime suppression. Nowadays crime against women has increased tremendously, so this work can be helpful to women in need. Primarily, the motive is to help common people live in a peaceful and better place.


3. STATE OF ART
Name - Crime Pattern Detection Using Data Mining.
Author - Shyam Varan Nath
Description - Here we look at the use of clustering algorithms in a data mining approach to help detect crime patterns and speed up the process of solving crime. We look at k-means clustering with some enhancements to aid in the identification of crime patterns.
Name - Incorporating data sources and methodologies for crime data mining.
Author - C Atzenbeck, A Celik, Z Erdem
Description - This paper investigates sources of crime data mining and methodologies for knowledge discovery, pointing out which forms of knowledge discovery are suitable for which methodology.
Name - Crime Prediction and Forecasting in TamilNadu using Clustering Approaches.
Author - S. Sivaranjani, S. Sivakumari, Aasha M.
Description - This paper uses the KNN classification technique. The KNN classification searches through the dataset to find the most similar instances when an input is given to it.
Name - "Efficient k-means clustering algorithm using ranking method in data mining"
Author - Kaur N, Sahiwal JK, Kaur Navneet
Description - This paper demonstrates the use of the K-means clustering algorithm. It explains the four steps of this clustering algorithm, namely initialization, classification, centroid recalculation and the convergence condition.
Name - Criminals and crime hotspot detection using data mining algorithms: clustering and classification
Author - Sukanya M., T. Kalaikumaran and Dr. S. Karthik
Description - To analyse the criminal data, clustering and classification techniques are used. These algorithms help

to identify the hotspots of criminal activities. In this paper we find the hotspots of criminal activities by using clustering and classification algorithms. Similar types of crime activities are grouped together; based on the cluster results, the cluster containing the most criminal activities is called the crime hotspot for that particular crime.
Name - ABHAYA: AN ANDROID APP FOR THE SAFETY OF WOMEN
Author - Ravi Sekhar Yarrabothu, Bramarambika Thota
Description - This paper presents Abhaya, an Android application for the safety of women, which can be activated by a single click whenever the need arises. A single click identifies the location of the place through GPS and sends a message comprising this location URL to the registered contacts, and also calls the first registered contact to help in dangerous situations. The unique feature of this application is that it sends the message to the registered contacts continuously every five minutes until the "stop" button in the application is clicked. Continuous location tracking information via SMS helps to find the location of the victim quickly so that she can be rescued safely.
Name - Android Application for women security system
Author - Kavita Sharma, Anand More
Description - This paper describes a GPS- and GSM-based "women security system" that combines a GPS device with alerts and messages triggered by an emergency button. When somebody is in trouble, they might not have much time; all they have to do is press the volume key. Global Positioning System (GPS) technology is used to find the location of the woman. The position information provided by the device can be viewed on Google Maps using the Internet or specialized software.


By referring to these papers, we have tried to develop the proposed system.
Comparison between existing technologies and the proposed system: In the existing women safety module, a woman needs to click a button in the app and then a help message is sent to the emergency contact number; this message is sent continuously until she presses the stop button. In the proposed system, a woman can press the power button 3 to 4 times and then a single help message is sent to her emergency contact number. The existing user module in the android application shows the crime rate in the form of maps or graphs; the proposed user module shows the crime status in the form of a pie chart based on crime type. The crime capture module is included only in the proposed system.

4. GAP ANALYSIS
Table: Gap Analysis

Criterion            | Manual Verification | Govt. DVS | Proposed services
Validity             | Medium              | Unlimited | High
Confidentiality      | Moderate            | High      | Medium
Cost of verification | Medium              | Medium    | Low
Security             | Moderate            | High      | Medium
Energy Consumption   | High                | High      | Moderate

5. PROPOSED SYSTEM
The developed model will help to reduce crimes and will help the crime detection field in many ways, that is, in reducing crimes by carrying out various necessary measures. In this system there are three modules, namely the user module, the woman

safety module and the crime capturing module. In the first module, the user module, the person will come to know whether the place to which he is travelling is safe or not. This module is basically an android application where the user can register himself. After registration, whenever the user logs in, he will see three options: view crime rate, crime capture and logout. With the first option he will be able to view the crime status of any area he wishes from the available list, displayed in graphical format. The second option, crime capture, which is also the second module, lets a user who finds a crime happening in the surroundings capture it and send it to the nearest police station from the available list, so that the police are notified and can take immediate necessary action. The last option is logout. The third module is the woman safety module. This is also an android application where the woman must be registered first. If a woman feels insecure, she can press the power button of her android mobile 4-5 times so that a notification is sent to the emergency contact number which she provided during the registration process. Along with the android application there will be a webpage available for both user and admin. Police officers act as admin; the admin can add and update data in the database area-wise.

6. ALGORITHMS
1. K-means clustering
We are using the clustering technique of data mining. Here clustering is used for grouping similar patterns based on crime type [7]. K-means clustering, an unsupervised learning algorithm, is used. Clustering helps us display the crime rate graphically using a pie chart. The K-means algorithm can be executed in the following steps:
1) Specify the value of k, that is, the number of clusters.


2) Randomly select k cluster centers. 3) Assign the data point to the cluster center whose distance from the cluster center is minimum of all the cluster centers. 4) Set the position of each cluster center to the mean of all data points belonging to that cluster. 5) Recalculate the distance between each data point and new obtained cluster centers. 6) If no data point was reassigned then stop, otherwise repeat from step 3). In our project for K-means algorithm there will be k clusters with k cluster centers and each center would represent a particular crime type. The data points will be the various types of crimes that have happened and the clustering would be done such that the similar crime type is grouped together in a cluster. This grouping of crime type will be displayed with the help of a pie chart which will help us to understand the rate of a particular crime in an area. 2. KNN classification Classification is a technique of data analysis which is used to extract and predict future trends in data based on similarity measures. KNN algorithm is used as a classification algorithm. Here, we are using KNN algorithm to get the list of nearest police stations. KNN Algorithm is based on feature similarity. An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors. The algorithm can be explained as: 1) A positive integer k is specified, along with a new sample. 2) We select the k entries in our database which are closest to the new sample.


3) We find the most common classification of these entries.
4) This is the classification we give to the new sample. [10]
In our project we have used KNN to find the list of the nearest police stations from the current location of the user. This helps the user select the police station nearest to him/her, so that the police can take action quickly by reaching the destination in time.

7. SYSTEM ARCHITECTURE
In the system architecture, the flow is:
1) First the user will register/log in to the android application. After logging in, he/she can view the crime status of a particular area he/she wants to see. At the back-end, data will be processed from the database to generate the result. Along with the android application there will be a webpage available for both user and admin. The crime data will be added and updated by police officers in the database; the police officer is the admin.
2) For the woman safety module, if any woman feels insecure she can press the power button of her android mobile phone 3 to 4 times, after which a help message is generated and sent to the emergency contact she gave during her registration on the android app.
3) For the crime capture module, if the user sees a crime happening in his/her surroundings, he/she can capture the crime scene using the mobile, after which he/she will get the nearest police station list, select the nearest police station, and send the photo to that station. With this photo the police can reach the location and carry out the further procedure.
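The two algorithms described in Section 6 can be sketched in a few lines of Python. This is an illustrative stand-in, not the project's code; the coordinates and station names are invented, and, as in the paper, KNN is used here as a k-nearest search over police stations rather than a majority-vote classifier.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Steps 1-6 above: pick k centers, assign points, recompute means."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # step 3: assign each point to its nearest center
            i = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[i].append(p)
        # step 4: move each center to the mean of its cluster
        centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters

def knn_nearest_stations(user_loc, stations, k=3):
    """Return the k police stations closest to the user's location."""
    return sorted(stations, key=lambda s: math.dist(user_loc, s[1]))[:k]

# hypothetical (lat, lon) data
crime_spots = [(18.46, 73.83), (18.47, 73.84), (18.52, 73.91), (18.53, 73.92)]
centers, _ = kmeans(crime_spots, k=2)

stations = [("Station A", (18.46, 73.82)), ("Station B", (18.55, 73.95)),
            ("Station C", (18.50, 73.88))]
print([name for name, _ in knn_nearest_stations((18.47, 73.83), stations, k=2)])
# -> ['Station A', 'Station C']
```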


Fig 1: System architecture

8. CONCLUSION
We looked at the use of data mining techniques in crime prediction and detection. Crime detection is a dynamic and emerging research field which aims to reduce crime rates. Data mining plays an important role in law enforcement agencies for crime analysis, in terms of crime detection and prevention. The developed work model will reduce crimes and will help the crime detection field in many ways, that is, by reducing crimes through various necessary measures.

REFERENCES
[1] J. Agarwal, R. Nagpal, and R. Sehgal, "Crime analysis using k-means clustering", International Journal of Computer Applications, Vol. 83, No. 4, December 2013.
[2] J. Han and M. Kamber, "Data mining: concepts and techniques", Jim Gray, Series Editor, Morgan Kaufmann Publishers, August 2000.
[3] P. Berkhin, "Survey of clustering data mining techniques", In: Accrue Software, 2003.


[4] W. Li, "Modified k-means clustering algorithm", IEEE Congress on Image and Signal Processing, pp. 616–621, 2006.
[5] Sukanya M., T. Kalaikumaran, and Dr. S. Karthik, "Criminals and crime hotspot detection using data mining algorithms: clustering and classification".
[6] S. Sivaranjani, Dr. S. Sivakumari, Aasha M., "Crime prediction and forecasting in Tamilnadu using clustering approaches".
[7] Kaur N., Sahiwal J.K., "Efficient k-means clustering algorithm using ranking method in data mining", International Journal of Advanced Research in Computer Engineering & Technology, vol. 1(3), pp. 85–91, 2012.
[8] Ravi Sekhar Yarrabothu, Bramarambika Thota, "ABHAYA: an Android app for the safety of women".
[9] Kavita Sharma, Anand More, "Android application for women security system".
[10] https://medium.com/@adi.bronshtein/a-quick-introduction-to-k-nearest-neighbors-algorithm-62214cea29c7
[11] Shyam Varan Nath, "Crime pattern detection using data mining".
[12] C. Atzenbeck, A. Celik, Z. Erdem, "Incorporating data sources and methodologies for crime data mining".


ACADEMIC ASSESSMENT WITH AUTOMATED QUESTION GENERATION AND EVALUATION Kishore Das1, Ashish Kempwad2, Shraddha Dhumal3, Deepti Rana4, Prof. S.P. Kosbatwar5 1,2,3,4,5

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, India. [email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
We introduce an automated approach that makes the process of generating exam papers more organized and productive and that also aids in developing a database of questions which can be classified for blending into exam question papers. Currently, there is no systematic procedure to ensure the quality of exam question papers. Hence, there is a requirement for a system which will automatically create the question paper from a teacher-entered description within a few seconds. We have implemented a modern evolutionary approach that is able to manage a multi-constraint problem while creating question papers for examinations in autonomous institutes from a very vast question bank database. The utilization of a randomization algorithm in an Automatic Question Paper Generator System implemented specially for autonomous institutes is described. The effort needed for generating question papers is diminished by this system, and there is no need for humans to spend time on designing question papers that could be spent on other important duties.
Keywords
NLP, POS Tagging, Answer Evaluation, Random Question Generator, keyword extraction, Descriptive Answer verifier.
1. INTRODUCTION
The examination committee in an institute works in a very conventional manner. This is time consuming and makes instructors tired of doing the same activities frequently. The question paper generator is a special and unique software used in schools, universities and colleges. Test paper setters who want to have a huge database of questions for frequent generation of questions can use this tool. This software can be implemented in various medical, engineering and coaching institutes for theory papers. You can create random question papers with this software anytime within seconds.
You can enter questions based on units, chapters and subjects, depending upon the system storage capacity and the requirement. For entering questions, you have to first specify the subject, and you can enter unlimited questions in a unit.

Examinations predominantly use question papers as a vital constituent to discover the caliber of students. A good exam gives all students an equal opportunity to fully demonstrate their learning.

2. SCOPE
We aim to develop an automated question paper generator and evaluator. The system must minimize human errors. The question paper is generated using automation so as to avoid repetition. The evaluation is meant to replace the manual checking of answer sheets, in order to reduce biased correction. The system must be reliable and efficient. It will also save human labour and time.

3. STATE OF ART
Various modules like an admin module, a user module, question entry and question management are mentioned. From the


entered input the paper is generated and saved as a .pdf file [1]. The Stanford Parser is used for parsing as well as part-of-speech tagging; adjectives are then separated and the relationships of the words determined [2]. Answers are converted into graphical forms to apply some similarity measures, WordNet and a spreading process to calculate a similarity score [3]. The systems developed to correct assignments primarily use short-text matching with a similarity score [3], template matching [4], or an answer validation system [5].

4. SYSTEM DESIGN
Architecture
It consists of three tiers: User Interface, Business Logic and Database. The two main users are Faculty and Admin. Faculty get access to the Add and Evaluate modules; Admin has access to the Evaluate and Generate modules. The three modules, Add, Generate and Evaluate, manipulate the database.

Fig 1: System Architecture

DFD (Data Flow Diagram) There are three major modules: Add Questions, Question Generation and Evaluation of Papers.


Fig 2: Data Flow Diagram

In the first module, faculty add the questions, which are stored in the database by the faculty itself. Whenever the add questions module is used, questions get added for generating question papers; these questions can later be selected randomly using the randomized algorithm. In the second module, the admin generates the question papers. The admin sets some parameters, such as the difficulty level of the questions, and questions can be selected unit-wise. After setting all the required parameters, the question paper is generated. In the third module, faculty or admin can verify the answer sheets of the students. Students write the answers on sheets; those answer sheets are scanned by the admin or faculty using OCR, and then evaluation of the answer sheets takes place. For evaluating subjective answers, the system matches the keywords of the student's answer with the standard answer to check its correctness; synonyms are also considered for keyword matching. For checking the grammar of the answers, lexical analysis is used.
Add module
Faculty will enter the question, subject, topic, difficulty level, keywords for answer evaluation etc. in the database.
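The random generation step can be sketched as below. The question bank fields (`unit`, `difficulty`, `marks`) are assumptions for illustration; the paper does not specify a schema.

```python
import random

# Illustrative question bank; field names are hypothetical, not the paper's schema.
bank = [
    {"q": "Define normalization.",     "unit": 1, "difficulty": "easy",   "marks": 5},
    {"q": "Explain B+ tree indexing.", "unit": 2, "difficulty": "medium", "marks": 5},
    {"q": "Describe the 2PL protocol.","unit": 2, "difficulty": "hard",   "marks": 5},
    {"q": "What is a transaction?",    "unit": 1, "difficulty": "easy",   "marks": 5},
]

def generate_paper(bank, unit, difficulty, n, seed=None):
    """Filter by the admin's parameters, then pick n questions at random,
    so consecutive papers do not repeat the same selection."""
    pool = [q for q in bank if q["unit"] == unit and q["difficulty"] == difficulty]
    if len(pool) < n:
        raise ValueError("question bank too small for the requested paper")
    return random.Random(seed).sample(pool, n)

paper = generate_paper(bank, unit=1, difficulty="easy", n=2, seed=42)
print([q["q"] for q in paper])
```

Seeding is optional; omitting `seed` gives a fresh shuffle each run, which is the behaviour the system relies on to avoid repetition.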


Fig 3: Add module

Generate module
Admin will select the subject, difficulty level, marks distribution, number of questions etc. and generate the required question paper. Questions can be selected using either the random algorithm or manual selection.

Fig 4: Generate module

Evaluate module
Answer sheets are converted into a text file using image recognition and OCR.

Fig 5: Evaluate module

The text file is further analyzed using grammar analysis, keyword extraction and synonym replacement for keywords, and keyword matching is done to collectively evaluate the result.

Mathematical model
The score will be calculated as:
P(QST, Keywords, Grammar) = P(QST) * P(Keywords) * P(Grammar)
where QST is Question Specific Term.


Here, there must be presence of Question Specific Terms, Keywords and Grammar collectively. The absolute absence of any one (a value of 0) will result in the score becoming 0, like a simple multiplication by the number 0.

5. ADVANTAGES
 Questions can be selected using difficulty levels.
 Admin can use the automated test paper generator module to save a lot of time.
 Randomization algorithm for selection of questions.
 With this system for exam paper generation there is no chance of the exam paper being leaked, as the paper can be generated a few minutes before the exam.
 With this system, fewer human efforts, less time and fewer resources are needed.
 Unbiased evaluation of answer sheets.

LIMITATIONS
 Problem of recognizing a wide variety of handwritten answer sheets.
 Keyword matching must support the usage of synonyms too.
 Difficult to evaluate the diagrams in the answer sheet.

APPLICATIONS
 In schools, colleges, universities and other educational institutions with huge databases to generate question papers frequently.


 In various medical, engineering and coaching institutions for theory examinations.
 Students are the most important group of indirect users, as they are the ones who are impartially being evaluated.
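The multiplicative scoring model described in the system design can be sketched as follows. The component scores here are simple fractions of matched terms, which is an assumption for illustration; the paper does not fix how each probability is computed.

```python
def component_score(answer_words, required):
    """Fraction of required terms present in the answer (0.0 to 1.0)."""
    if not required:
        return 1.0
    found = sum(1 for term in required if term in answer_words)
    return found / len(required)

def score(answer, qst_terms, keywords, grammar_ok):
    """P(QST, Keywords, Grammar) = P(QST) * P(Keywords) * P(Grammar).
    If any component is entirely absent, the product collapses to 0."""
    words = set(answer.lower().split())
    p_qst = component_score(words, qst_terms)
    p_kw = component_score(words, keywords)
    p_grammar = 1.0 if grammar_ok else 0.0
    return p_qst * p_kw * p_grammar

ans = "normalization removes redundancy from relational tables"
print(score(ans, qst_terms={"normalization"},
            keywords={"redundancy", "tables"}, grammar_ok=True))  # -> 1.0
```

Multiplying the components rather than averaging them encodes the rule stated above: an answer with perfect keywords but no question-specific terms still scores 0.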

6. CONCLUSION
The proposed work describes an automated system that moves away from the traditional process of paper generation to an automated process, by giving controlled access to the resources, attained by involving users and their roles in the colleges. We have also considered the significance of randomization to avoid duplication of questions. Hence the resulting automated system for question paper generation will yield enhanced random creation of question papers and automated evaluation.

FUTURE WORK
 Addition of a module that would accept voice data from a microphone and correct it without any human assistance.
 Scanning for diagrams, figures and blocks.
 Behavior prediction and vocabulary of the student can be checked based on the writing style.
 A module can be constructed wherein it simulates all the answer sheets, displays the most ideal sheet, compares it with the original and shows a similarity ratio.

REFERENCES
[1] Prof. Mrunal Fatangare, Rushikesh Pangare, Shreyas Dorle, Uday Biradar, Kaustubh Kale, "Android Based Exam Paper Generator", IEEE, 2018, pp. 881-884.
[2] Prateek Pisat, Shrimangal Rewagad, Devansh Modi, Ganesh Sawant, Prof. Deepshikha Chaturvedi, "Question Paper Generator and Answer Verifier", IEEE, 2017, pp. 1074-1077.
[3] Amarjeet Kaur, M. Sasikumar, Shikha Nema, Sanjay Pawar, "Algorithm for Automatic Evaluation of Single Sentence Descriptive Answer", 2013.
[4] Tilani Gunawardena, Medhavi Lokuhetti, Nishara Pathirana, Roshan Ragel and Sampath Deegalla, "An Automatic Answering System with Template Matching for Natural Language Questions", Faculty of Engineering, University of Peradeniya, Peradeniya 20400, Sri Lanka.
[5] Anne-Laure Ligozat, Brigitte Grau, Anne Vilnat, Isabelle Robba, Arnaud Grappy, "Towards an automatic validation of answers in Question Answering", 19th IEEE International Conference on Tools with Artificial Intelligence.


A COMPREHENSIVE SURVEY FOR SENTIMENT ANALYSIS TECHNIQUES Amrut Sabale1, Abhishek Charan2, Tushar Thorat3, Pavan Deshmukh4 1,2,3,4

Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India. [email protected], [email protected], [email protected], [email protected]

ABSTRACT
Analysis of public information from social media can yield interesting results and insights into the world of public opinion about almost any product, service or personality. Social network data is one of the most effective and accurate indicators of public sentiment. The explosion of Web 2.0, also called the Participative and Social Web, has led to increased activity in podcasting, blogging, tagging, social bookmarking, and social networking. As a result, there has been an eruption of interest in mining these vast resources of data for opinions. Sentiment Analysis or Opinion Mining is the computational treatment of opinions, sentiments and subjectivity of text. The main idea behind this article is to bring out the process involved in sentiment analysis. In this paper we discuss techniques which allow classification of sentiments.
Index Terms—Sentiment analysis, sentiment classification, social media.

1. INTRODUCTION
Sentiment is an attitude, thought, or judgment prompted by feeling. Sentiment analysis, which is also known as opinion mining, studies people's sentiments towards certain entities. The Internet is a resourceful place with respect to sentiment information. From a user's perspective, people are able to post their own content through various social media, such as forums, micro-blogs, or online social networking sites. From a researcher's perspective, many social media sites release their application programming interfaces (APIs), prompting data collection and analysis by researchers and developers. For instance, Twitter currently has three different versions of APIs available, namely the REST API, the Search API, and the Streaming API. With the REST API, developers are able to gather status data and user information; the Search API allows developers to query specific Twitter content, whereas the Streaming API is able to collect Twitter content in real time. Moreover, developers can mix those APIs to create their own applications. Hence, sentiment analysis seems to have a strong foundation with the support of massive online data.

Microblogging websites have evolved to become a source of varied kinds of information. This is due to the nature of micro-blogs, on which people post real-time messages about their opinions on a variety of topics, discuss current issues, complain, and express positive sentiment for products they use in daily life. In fact, companies manufacturing such products have started to poll these micro-blogs to get a sense of the general sentiment for their product. Many times these companies study user reactions and reply to users on microblogs. One challenge is to build technology to detect and summarize an overall sentiment.

The purpose of this survey is to investigate lexicon-based and machine learning techniques for different sentiment analysis tasks. Sentiment analysis tasks are surveyed as subjectivity classification and sentiment classification. Therefore, articles written in the last five years on sentiment classification techniques for these tasks are


discussed in this study. Moreover, sentiment analysis approaches, applications of sentiment analysis and some general challenges in sentiment analysis are presented.

2. SENTIMENT ANALYSIS APPROACHES - A SURVEY
Trupthi, Suresh Pabboju, G. Narsimha [2]: The key features of this system are the training module, which is built with the help of Hadoop and MapReduce, classification based on Naive Bayes, time-variant analytics and a continuous-learning system. The fact that the analysis is done in real time is the major highlight of this paper.
Juan Guevara, Joana Costa, Jorge Arroba, Catarina Silva [5]: One of the most popular and fastest-growing social networks for microblogging is Twitter, which allows people to express their opinions using short, simple sentences. These texts are generated daily, and for this reason people commonly want to know the trending topics and their drifts. This paper proposes a mobile app that provides information on areas such as politics, society, tourism and marketing using a statistical lexicon approach. The application shows the polarity of each theme as positive, negative, or neutral.
S. Rajalakshmi, S. Asha, N. Pazhaniraja [1]: Sentiment analysis or opinion mining is useful for mining facts from such data. The text data obtained from the social network primarily undergoes emotion mining to examine the sentiment of the user message. Most sentiment or emotion mining uses machine learning approaches for better results. The principal idea behind the article is to bring out the process involved in sentiment analysis, to investigate the various existing methods or techniques for performing sentiment analysis, and to present the various tools used to demonstrate the process involved in sentiment analysis.
Anuja P Jain, Asst. Prof. Padma Dandannavar [3]: The objective of this paper is to give step-by-step detail about the process of sentiment analysis on Twitter data using machine learning. The paper also provides details of the proposed approach for sentiment analysis. This work proposes a text analysis framework for Twitter data using Apache Spark, and hence is more flexible, fast and scalable. Naive Bayes and decision tree machine learning algorithms are used for sentiment analysis in the proposed framework.
Anusha K S, Radhika A D [4]: This paper discusses the levels and approaches of sentiment analysis, sentiment analysis of Twitter data, the existing tools available for sentiment analysis and the steps involved. Two approaches, machine learning and lexicon based respectively, are discussed with examples.
Ms. Farha Nausheen, Ms. Sayyada Hajera Begum [6]: The opinion of the public for a candidate will impact the potential leader of the country. Twitter is used to acquire a large, diverse data set representing the current public opinions of the candidates. The collected tweets are analyzed using a lexicon-based approach to determine the sentiments of the public. The paper determines polarity and subjectivity measures for the collected tweets that help in understanding the user opinion for a particular candidate. Further, a comparison is made among the candidates over the type of sentiment.
Sentiment analysis can be classified into the lexicon-based approach, the machine learning approach and the hybrid approach. Sentiment analysis approaches are listed in Table 1.
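Several of the surveyed systems (e.g. [2], [3]) classify text with Naive Bayes. As a concrete illustration, here is a self-contained multinomial Naive Bayes sketch with add-one smoothing; the toy corpus and function names are assumptions for illustration, not taken from the surveyed papers:

```python
import math
from collections import Counter, defaultdict

def train_nb(texts, labels):
    """Count words per label to fit a multinomial Naive Bayes model."""
    word_counts = defaultdict(Counter)   # label -> word frequency counts
    label_counts = Counter(labels)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def predict_nb(model, text):
    """Pick the label maximizing log prior + smoothed log likelihoods."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_logp = None, float("-inf")
    for label in label_counts:
        logp = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            # Add-one (Laplace) smoothing avoids zero probabilities.
            logp += math.log((word_counts[label][token] + 1)
                             / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

model = train_nb(
    ["great phone love it", "good battery great screen",
     "awful service hate it", "bad screen terrible battery"],
    ["positive", "positive", "negative", "negative"])
print(predict_nb(model, "love the great battery"))  # positive
```

In practice the surveyed systems train on far larger labeled tweet collections, but the mechanics are the same.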


Fig. 1. Process of Sentiment Analysis

3. APPROACHES IN SENTIMENT ANALYSIS
This section outlines the various steps involved in sentiment analysis and the various sentiment classification approaches.
Data acquisition: In this first phase, data is collected from various social media such as Twitter, Facebook, LinkedIn etc. These data are in an unstructured format, so it is difficult to analyze them manually.

Therefore, natural language processing is used to classify and mine the data.
Data preprocessing: This phase cleans the collected data before analysis using various data cleaning methods: getting rid of extra spaces, selecting and treating all blank cells, removing duplicates, highlighting errors, changing text to lower/upper/proper case, spell checking, deleting all formatting, etc.
Sentiment detection: The opinion in each collected sentence is examined in this phase. Subjective sentences, which carry more sentiment and contain beliefs, opinions and reviews, are retained. Objective sentences, which contain facts and factual information, are discarded.
Sentiment classification: Each sentence is classified into one of three categories: positive, negative or neutral.
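The classification step can be illustrated with a minimal lexicon-based scorer in the spirit of the lexicon approach discussed in this paper; the word lists below are hypothetical stand-ins for a real sentiment lexicon:

```python
# Hypothetical miniature lexicon; real systems use lists of thousands of words.
POSITIVE_WORDS = {"good", "great", "excellent", "love", "happy"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "hate", "sad"}

def classify_sentence(sentence):
    """Label a sentence positive, negative or neutral by counting lexicon hits."""
    tokens = sentence.lower().split()
    score = (sum(t in POSITIVE_WORDS for t in tokens)
             - sum(t in NEGATIVE_WORDS for t in tokens))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify_sentence("I love this great phone"))  # positive
```

The fixed per-word score is precisely the demerit noted for lexicon-based approaches in Table 1.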

Table 1: Sentiment Classification Techniques

Type: Lexicon based
  Approaches: Dictionary based approach; Corpus based approach
  Merits: broader term analysis.
  Demerits: limited number of words in lexicons, and a fixed score is assigned to opinion words.

Type: Machine learning
  Approaches: Bayesian Networks; Maximum Entropy; Naive Bayes Classification; Support Vector Machine; Neural Networks
  Merits: the capability to create trained models for particular purposes.
  Demerits: low applicability to new data; it becomes applicable only when the data has been labeled, which can be costly.

Type: Hybrid
  Approaches: Novel Machine Learning Approach; Ensemble Approaches; Lexicon based machine learning
  Merits: analysis is done at the sentence level, so it shows document expressions exactly by adding or removing words in the lexicon.
  Demerits: noisy reviews.

4. APPLICATIONS OF SENTIMENT ANALYSIS
Sentiment analysis is a technique which allows big companies to categorize the massive amount of unstructured feedback data from social media platforms.
Finding hot keywords: Opinion mining can majorly help in discovering hot search keywords. This feature can help the brand in their SEO (Search Engine Optimization). This means that opinion mining will help them make strategies about how their brand will come up among the top results when a trending or hot keyword is searched in a search engine.
Voice of customer: Sentiment analysis of social media reviews, mentions and surveys helps to broadcast the voice of customers to the brand they are expressing their views about. This way the brand knows exactly how common folk feel about their services. The company can use this information in growing their market, advertisement targeting and building loyalty among its customers.
Employee feedback: Sentiment analysis can also be used to receive feedback from the employees of the company and analyze their emotions and attitude towards their job, and to determine whether they are satisfied with their job or not.
Better services: Text mining can provide a filter showing which service of the company is getting more negative feedback. This will help the company to know what problems are arising with that particular service, and based on this information the company can rectify these problems.
Get to know what's trending: This will not only help the company to stay updated and connect more with the audience, but it will also facilitate the rise of new ideas for developing new products. This will allow the company to determine what the majority of the audience demands, and develop a product according to these demands.
Feedback on pilot releases and beta versions: When a company releases a new product or service, it is released as a pilot or beta version. The monitoring of public feedback at this stage is very crucial, so text mining from social media platforms and review sections greatly helps accelerate this process.

5. CHALLENGES FOR SENTIMENT ANALYSIS


The challenges in sentiment analysis are:
Entity recognition - What is the person actually talking about? E.g., is "300 Spartans" a group of Greeks or a movie?
Classification filtering limitation - Some irrelevant opinions are filtered out to determine the most popular concept, which results in a limitation of the filtering.
Sentence parsing - What is the subject and object of the sentence, and which one does the verb and/or adjective actually refer to?
Sarcasm - If you don't know the author, you have no idea whether 'bad' means bad or good.
Twitter - Abbreviations, lack of capitals, poor spelling, poor punctuation and poor grammar.

6. CONCLUSION
The sentiment classification is done based on three different measures. These measures signify the positive, negative or neutral attitude of users towards a particular software or application, thereby enabling us to know the status of the software from the users' perspective. In this paper, we have studied the approaches and views of many authors, which reveal several challenges that arise due to the sheer amount of data on the web, and it shows that sentiment analysis is a very high-demand research area for decision support systems.

REFERENCES


[1] S. Rajalakshmi, S. Asha, N. Pazhaniraja, "A Comprehensive Survey on Sentiment Analysis", 2017 4th International Conference on Signal Processing, Communications and Networking (ICSCN 2017), March 16-18, 2017, Chennai, India.
[2] M. Trupthi, Suresh Pabboju, G. Narasimha, "Sentiment Analysis on Twitter Using Streaming API", 2017 IEEE 7th International Advance Computing Conference.
[3] Anuja P Jain, Asst. Prof. Padma Dandannavar, "Application of Machine Learning Techniques to Sentiment Analysis", 2nd International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT).
[4] Anusha K S, Radhika A D, "A Survey on Analysis of Twitter Opinion Mining Using Sentiment Analysis", International Research Journal of Engineering and Technology (IRJET), Dec. 2017.
[5] Juan Guevara, Joana Costa, Jorge Arroba, Catarina Silva, "Harvesting Opinions in Twitter for Sentiment Analysis".
[6] Ms. Farha Nausheen, Ms. Sayyada Hajera Begum, "Sentiment Analysis to Predict Election Results Using Python", Proceedings of the Second International Conference on Inventive Systems and Control (ICISC 2018).
[7] Godbole, Namrata, Manja Srinivasaiah, and Steven Skiena, "Large-Scale Sentiment Analysis for News and Blogs", ICWSM 7.21 (2007): 219-222.
[8] Mitali Desai, Mayuri A. Mehta, "Techniques for Sentiment Analysis of Twitter Data: A Comprehensive Survey", International Conference on Computing, Communication and Automation (ICCCA 2016).
[9] Boia, Marina, et al., "A :) is worth a thousand words: How people attach sentiment to emoticons and words in tweets", Social Computing (SocialCom), 2013 International Conference on, IEEE, 2013.


E – REFERENCING OF DIGITAL DOCUMENT USING TEXT SUMMARIZATION Harsh Purbiya1, Venktesh Chandrikapure2, Harshada Sandesh Karne3, Ishwari Shailendra Datar4, Prof. P. S. Teli5 1,2,3,4,5

Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India. [email protected], [email protected], [email protected], [email protected]

ABSTRACT
To have a brief look over a particular topic, and to search for a specific answer in a documented book or e-book, is still quite a hectic task. The information might be stated on a number of different pages, which may be ordered or random. This problem can be solved by automated text summarization. With this system, the user, who would typically be a student, only needs to give an e-book as input to the system; after the information gets processed, he/she is free to ask queries. In order to achieve this, we have used machine learning, neural networks, deep learning, etc. Text summarization approaches are classified into two categories: extractive and abstractive. This paper presents a comprehensive survey of both approaches in text summarization.
Keywords
Automatic Text Summarization, Extractive Summarization, Natural Language Processing (NLP), NLTK Library, Part-Of-Speech (POS).

1. INTRODUCTION
There is a wealth of textual content available on the Internet. But, usually, the Internet contributes more data than is desired. Therefore, a twin problem is detected: seeking appropriate documents through an awe-inspiring number of reports offered, and absorbing a high volume of important information. The objective of automatic text summarization is to condense the source text into a precise version that preserves its report content and global denotation. The main advantage of text summarization is that the reading time of the user can be reduced. A marvelous text summary system should reproduce the assorted themes of the document even as keeping repetition to a minimum. Automatic Text Summarization methods are broadly classified into abstractive and extractive summarization.

Reading every page of a book, memorizing each piece of information and relocating it again afterwards in a short time span is mostly not possible. Our users, possibly students, don't have time to go through hundreds of pages of every book. A detailed study will obviously take a lot of time, but going through the piece again for a small piece of information isn't very efficient. The time and effort taken for that lengthy process can be invested somewhere more fruitful. If an automated system exists which can provide answers to most of their doubts in the form of a summary, then this will not only enhance academics but also improve knowledge. To overcome this situation, Automatic Summarization of Textual Documents can be taken into consideration. Automatic summarization has grown into a crucial and appropriate engine for supporting and illustrating text content in the latest speedily emergent information age. It is very complex for humans to manually summarize over-sized documents of text.

This paper is divided into different sections. Section 1 is the introduction, which we have already gone through. Sections 2 and 3 are all about the research papers that have been referenced for the


current work, and their comparison is also done. The later sections present the working of the proposed system, its advantages, future scope and the conclusion.

2. LITERATURE SURVEY
In the past, many different kinds and versions of summarizers have been introduced and implemented. All of them are based on either abstractive or extractive summarization. A few referenced papers are mentioned below.
Review On Natural Language Processing:
 The field of natural language processing (NLP) is deep and diverse.
 The importance of NLP in processing the input text to be synthesized is reflected.
 Natural language processing (NLP) is a collection of techniques used to extract grammatical structure and meaning from input in order to perform a useful task; as a result, natural language generation builds output based on phonology, morphology, semantics and pragmatics.
Comparative Study of Text Summarization Methods:
 Summarization has been viewed as a two-step process.
 The first step is the extraction of important concepts from the source text by building an intermediate representation.
 The second step uses this intermediate representation to generate a summary.
Automatic Text Summarization and its Methods: There are three kinds of text summarization systems:
 Abstractive vs. Extractive
 Single Document vs. Multi-Document
 Generic vs. Query-based


3. GAP ANALYSIS
For the purpose of creating automated summarization, many research papers have been introduced in recent years. We used a few of them as references and learned different facts from them.
Review On Natural Language Processing [2013], written by Prajakta Pawar, Prof. Alpa Reshamwala and Prof. Dhirendra Mishra, discussed the fact that NLP is deep and diverse; they also proposed that the main output is based on phonology, morphology, semantics and pragmatics.
Comparative Study of Text Summarization Methods [2014], written by Nikita Munot and Sharvari S. Govilkar, specifies that summarization has been viewed as a two-step process.
Graph Based Approach For Automatic Text Summarization [2016] is written by Akash, Somaiah and Annapurna. In this paper they introduce an approach to summarization via graph and clustering techniques.

4. PROPOSED WORK
This section will illustrate the purpose and give a complete description of the working of the system. It will also explain system constraints, interfaces and interactions with other external applications.
System Design
Two types of input are provided to the system: the first is the document which gets fed into the system, provided by the admin, and the other is the query, which is supplied by the user. The original document is duplicated and stored in the database, and the copy is sent further along the system. The viability of the document is checked, i.e., whether given specifications such as the size of the file, the language of the e-book etc. are followed. After the first input, the document is passed through the pre-processing module where all the information is parsed, NLP algorithms are applied, and proper elemental tags are generated via POS tagging or graph-based tagging. This parsed information is


saved accordingly in the particular database (RDBMS / Graph). After the preprocessing stage, the query from the user is the second input to the system. The query asked by the user is processed using different text processing algorithms and validated. If valid, the system moves to the next phase. Here, to enhance the working of the system, Artificial Intelligence, Deep Learning, Neural Networks and Machine Learning concepts are used to successfully identify the particular part of the e-book which contains the answer to the user's query. The particular passage gets tagged.
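The two-step extractive process noted in the literature survey (build an intermediate representation, then select sentences) can be sketched with a minimal frequency-based summarizer. This is an illustrative sketch under simplifying assumptions, not the system's actual implementation:

```python
import re
from collections import Counter

def summarize(text, num_sentences=2):
    """Score sentences by document-wide word frequency; return the top ones."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Step 1: intermediate representation - word frequencies over the document.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    # Step 2: score each sentence by the frequencies of its words.
    scored = []
    for idx, sent in enumerate(sentences):
        score = sum(freq[w] for w in re.findall(r"[a-z']+", sent.lower()))
        scored.append((score, idx, sent))
    top = sorted(scored, reverse=True)[:num_sentences]
    # Emit the selected sentences in their original document order.
    return " ".join(sent for _, idx, sent in sorted(top, key=lambda t: t[1]))
```

A real system would replace the raw frequency representation with POS tags or a graph, as described above, but the selection step is structurally the same.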

System Inputs and Outputs
This system will present a simple interface which helps the user interact with the input given by the user itself in the form of a digitized document. The user can seek answers by giving another input as a query. This system consists of two types of input: the first is the document which gets fed into the system, and the other is the query. Both of these inputs are given by the user. After the first input, the document is passed through the preprocessing module where all the information is parsed, NLP algorithms are applied, and proper elemental tags are generated via POS tagging or graph-based tagging. This parsed information is saved accordingly in the particular database (RDBMS / Graph). The processing also has its own complex procedures. Each and every sentence, even a single word or literal, needs to be parsed. For performing that

few text processing algorithms are implemented. But to enhance the working of the system, Artificial Intelligence, Deep Learning, Neural Networks and Machine Learning concepts are needed too. When the preprocessing stage is over, the user can toggle the queries for which he expects an automated summary. This user query will be the second input for the system. We take into consideration that the user will search for more than one query to satisfy his thirst for information. For multiple second inputs, there will be multiple different outputs.

After the second input from the user, the system processes that input as a query. The whole document is inspected in a brief and efficient manner so that an informative summary can be generated in response to that particular query. Yet this system has some limitations and bounds. Technical and computation-related documents are not entertained; only particular theoretical e-books or documents shall be preprocessed. The user group is also very specific, i.e., particular students, schools or colleges.
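The query-answering step described above can be sketched as a simple word-overlap search; this is an illustrative stand-in (hypothetical function, no AI components) for the deep-learning matching the paper envisions:

```python
import re

def answer_query(document, query):
    """Return the document sentence sharing the most words with the query."""
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    query_words = set(re.findall(r"[a-z']+", query.lower()))
    best, best_overlap = "", -1
    for sent in sentences:
        overlap = len(query_words & set(re.findall(r"[a-z']+", sent.lower())))
        if overlap > best_overlap:
            best, best_overlap = sent, overlap
    return best
```

For example, `answer_query("Photosynthesis occurs in leaves. Roots absorb water.", "Where does photosynthesis occur?")` selects the first sentence, since it shares the most query words.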


System Functions
With the help of the interface provided to handle the system, the user can benefit by getting an informative summary for his/her question or query over a document. This document must pass all the conditions before preprocessing. The conditions are: the document to be fed needs to be completely theoretical, and there should be no technical, computational or mathematical proof parts, etc. The rectification of these conditions is a future scope. The user can also toggle the newly generated interactive document. Each and every query a user asks in a session will be saved to enhance the user experience, and also so that the system can learn from it to deliver better results in the future.
User Characteristics
For now, the user category will stick to students only. The role of students will be that of stakeholders, as they will be getting benefits from this system in gaining knowledge or in their academic session. A student will take a document and feed it into the system; then, as the second input, he/she will give a query to the system on the interface, and the system will give its output in the form of a summary from that document. At the initial stage, even the categories and levels of students will be neutralised and not distinguished. Looking at the future scope of the user category, even teachers can use such a system for their betterment, time saving and productivity.

5. ADVANTAGES
The system provides a summary in a general automated manner. This simply means that the output will be very much simplified and easy to read and understand. Re-reading to find a particular conclusion is banished. Sometimes a lot of rework is required to find a particular piece of information from a chunk of gigantic


literature. Such a hectic task would be normalised. Time saved means more time for quality work. A lot of the time that was being consumed in research work would be minimized, leading to the betterment of different parameters. The system facilitates the ease of grasping the crucial topics in one go. Sometimes we ignore some of the most important aspects or points of a particular document while examining or reading it. The system is easy to understand and requires hardly any setup. The output of the procedure is simple, readable and understandable. A high degree of abstraction is attained: the designed system is kept in the background, so the primary actor need not be confused by the methodologies and technicalities used.

6. FUTURE WORK
Today, the rate at which data is increasing day by day is vicious. By data we mean not only visual data but also textual data. Suppose you work in a law firm and you need to verify a few documents, and those documents have sub-parts of 6000 scripts each. Such work for a single person or even a group is tedious. Another example can be taken from business-related books and documents. Business doesn't come in a single book but in experience, and that experience is scripted in documents which are delivered regularly. To study all these documents is practically impossible. With the help of automated summarization and e-referencing, an individual will get help in finding the right information from these large chunks. In future, our system can be used by different schools and institutional organizations for examination purposes and also for teaching the right content to students.


7. CONCLUSION
In this paper, we have presented a way in which automated summarization can be performed and used for various fundamental purposes. The procedure that we proposed consists of a few steps. First we need to take an input from the user, which would be in the form of a digital document. Our system scans that document and preprocesses it for the later steps. The next step is applying different algorithms for the purpose of summarization and information retrieval. This may include scanning, parsing, POS tagging and different technical measures. The original document is saved in the database along with the output obtained after the preprocessing, i.e., the summary. After this, the user gives the second input to the system: the query that needs to be looked for. The system takes that query and looks for the appropriate solution. At last, an output in the form of text is generated and provided to the user.


REFERENCES
[1] Prajakta Pawar, Prof. Alpa Reshamwala and Prof. Dhirendra Mishra, "Review On Natural Language Processing", An International Journal (ESTIJ), ISSN: 2250-3498, Vol. 3, No. 1, February 2013. https://www.researchgate.net/publication/235788362
[2] Nikita Munot and Sharvari S. Govilkar, "Comparative Study of Text Summarization Methods", International Journal of Computer Applications (0975-8887), Vol. 102, No. 12, September 2014. https://pdfs.semanticscholar.org/0c95/0bc8f234ecb6cf57f13bca7edd118809d0ca.pdf
[3] Neelima Bhatiya and Arunima Jaiswal, "Automatic Text Summarization and its Methods", 2016 6th International Conference. https://ieeexplore.ieee.org/abstract/document/7508049/
[4] Akash, Somaiah, Annapurna, "Graph Based Approach For Automatic Text Summarization", International Journal of Advanced Research in Computer and Communication Engineering, Vol. 5, Special Issue 2, October 2016. https://ijarcce.com/wp-content/uploads/2016/11/IJARCCE-ICRITCSA-2.pdf


ONLINE SHOPPING SYSTEM WITH STITCHING FACILITY Akshada Akolkar1, Dahifale Manjusha2, Chitale Sanchita3 1,2,3

Dept. of computer engineering, S.C.S.M.C.O.E, Nepti, Ahmednagar, Maharashtra, India. [email protected], [email protected], [email protected]

ABSTRACT
Online shopping is a form of electronic commerce which allows consumers to directly buy goods or services from a seller over the internet using a web browser. Consumers find a product of interest by visiting the website of the retailer directly or by searching among alternative vendors using a shopping search engine, which displays the same product's availability and pricing at different e-retailers. The proposed web application would be attractive, have a professional look and be user friendly. The online shopping system is a web-based application intended for online retailers. The main goal of this system is to make it interactive and easy to use. It would make searching, viewing and selection of products easier. The user can then view the complete specification of each product. The application also provides a drag-and-drop feature, so that a user can add a product to the shopping cart by dragging the item into the cart. The main aim of the project is to automate the tailoring sector, which is currently maintained manually. This automation will provide better services such as a fitting facility, a paperless environment, quick search, data integrity and security.
Keywords
Shopping process, E-Commerce and mining, Web mining, Website reorganization, Improved Mining, Consumer buying behaviour.

1. INTRODUCTION
E-commerce is fast gaining ground as an accepted and used business paradigm. More and more business houses are implementing web sites providing functionality for performing commercial transactions over the web. It is reasonable to say that the process of shopping on the web is becoming commonplace. The objective of this project is to develop a general-purpose e-commerce store where products like clothes can be bought from the comfort of home through the Internet. However, for implementation purposes, this paper will deal with online shopping for clothes.

Currently, customers have to walk to the tailor shops to get their measurements taken for the tailoring of their garments. Their details are taken and kept on paper. Customers need to take some time out of their busy schedule and visit the tailor. This is time-consuming and costly. As the tailors work manually, the whole process tends to be slow. Customers also have no prior information on the cost of their garments. So the proposed system is aimed at assisting in the management of tailoring activities within the industry. It will provide online services to customers such as measurement submission to their tailors and checking whether their garments are finished, and it will also help with proper record keeping. The availability of the right information, information safety, and easy storage, access and retrieval will be ensured.

2. RELATED WORK
Tailors use traditional manual systems to book in their clients. The clients have to travel to the location of the tailor shop to get their measurements taken. These measurements are written on paper or in books. This system will solve all these problems, automate the tailor shops

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.


Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

and enhance accessibility irrespective of geographical location, provided there is internet access.

3. PROPOSED WORK
The proposed system will automate the current manual tailoring system, maintain a searchable customer and product database, and maintain data security and user rights. The system will enable customers to send their measurements to their tailors for their clothes to be made. It will also capture information about the cost, the fabric type, the urgency with which a customer wants the dress finished, the type of material to be used, and the quantity in terms of pairs needed. It computes the total cost depending on the selected fabric, type of material, quantity and duration, and makes that information available to the customer. The system enables report generation: it can give a report of finished garments ready for collection and of bookings made, and the administrator can view all the customers and their details, finished garments and all the bookings made. It creates a data bank for easy access and retrieval of customer details, orders placed, and the users who register to the system. The registration process for customers is provided online, which helps them successfully submit their measurements. The system has an inbuilt validation mechanism to validate the entered data. A customer can log in to the system to check on the status of the clothes for collection; the system shows the already completed garments ready for the client to collect. The system also provides information about the cost of each garment the customer intends to get stitched. The data will be stored in the database for further reference or audit.
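The total-cost computation described above can be sketched as follows. This is an illustrative sketch only: the fabric rates, urgency multipliers and function name are assumed example values, not figures from the paper.

```python
# Illustrative sketch of the total-cost computation described above.
# Fabric rates (per garment) and urgency multipliers are assumed example
# values, not figures from the paper.
FABRIC_RATES = {"cotton": 300.0, "silk": 900.0, "linen": 500.0}
URGENCY_MULTIPLIER = {"normal": 1.0, "express": 1.5}  # rush orders cost more

def total_cost(fabric: str, quantity: int, urgency: str = "normal") -> float:
    """Total cost for `quantity` garments of the selected fabric and urgency."""
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    return FABRIC_RATES[fabric] * quantity * URGENCY_MULTIPLIER[urgency]
```

For example, two cotton garments at normal urgency cost 300.0 × 2 × 1.0 = 600.0 in these assumed units; the same lookup-and-multiply shape extends to per-material surcharges and duration factors.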

Figure 1: Use Case Diagram





4. ADVANTAGES
1. E-commerce has changed our lifestyles entirely, because we do not have to spend time and money travelling to the market.
2. It is one of the cheapest means of doing business.
3. It ensures the availability of right information, information safety, and easy storage, access and retrieval.
4. It eliminates manual interventions and increases the speed of the whole process.
5. It provides better services: good record keeping, data integrity, data security, quick search and a paperless environment.

5. CONCLUSION
The main reason behind establishing the online shopping system with stitching facility is to enable the customer and administrator to interact in a convenient, fair and timely manner. Therefore the IT used by whoever uses the system should support the core objective of the system if it is to remain relevant. This may involve training the staff on how to enter right and relevant data into the system, and management keeping the hardware and software requirements of the system updated. IT and computer systems need to keep being upgraded as more and more IT facilities and software are introduced into today's IT market. The researchers acknowledge that this system does not handle all sections of the tailor shop, such as the asset section and the staff members of the shop. The researchers therefore suggest further research into building a system that captures all fields pertaining to the tailor shop.

6. ACKNOWLEDGEMENT
First and foremost, we would like to thank our guide, Prof. Pawar S.R., for his

constant guidance and support. We will forever remain grateful for the constant support and guidance extended by our guide in making this report. Through our many discussions, he helped us to form and solidify ideas. The invaluable discussions we had with him, the penetrating questions he put to us, and the constant motivation have all led to the development of this project. We would like to convey our sincere and heartfelt thanks to Principal Dr. Deshpande R.S. for his cooperation and valuable guidance. We also wish to express our sincere thanks to the Head of Department, Prof. Lagad J.U., for his support.

REFERENCES
[1] Anand Upadhyay, Ambrish Pathak, Nirbhay Singh, "Evolution of Online Shopping: E-Commerce", International Journal of Commerce and Management Research, June 2017.
[2] Neha Verma, Jatinder Singh, "Improved Web Mining for E-Commerce Website Restructuring", IEEE, 2015.
[3] Ifeoma Adaji and Julita Vassileva, "Tailoring Persuasive Strategies in E-Commerce", Persuasive Technology, 2017.
[4] Subramani Mani and Eric Walden, "The Impact of E-Commerce Announcements on the Market Value of Firms", Information Systems Research, Vol. 12, Issue 2, pp. 135-154, 2001.
[5] Shahrzad Shahriari, Mohammadreza Shahriari and Saeid Gheiji, "E-Commerce and Its Impact on Global Trend and Market", International Journal of Research, Vol. 3, Issue 4, pp. 49-55, 2015.
[6] Menal Dahiya, "Study on E-Commerce and Its Impact on Market and Retailers in India", Advances in Computational Sciences and Technology, ISSN 0973-6107, Volume 10, Number 5 (2017), pp. 1495-1500.
[7] Shen Zihao, Wang Hui, "Research on E-commerce Application Based on Web Mining", Proceedings of IEEE International Conference on Intelligent Computing and Cognitive Informatics, 2010, pp. 337-340, DOI 10.1109/ICICCI.2010.89.
[8] Zhiwu Liu, Li Wang, "Study of Data Mining Technology Used for E-Commerce", Proceedings of IEEE Third International Conference on Intelligent Networks and Intelligent Systems, 2010, pp. 509-512, DOI 10.1109/ICINIS.2010.61.




[9] Babita Saini, "E-Commerce in India", International Journal of Business and Management, ISSN 2321-8916, Vol. 2, Issue 2, pp. 1-5, Feb 2014.
[10] Latika Tamrakar, S. M. Ghosh, "Identification of Frequent Navigation Pattern Using Web Usage Mining", International Journal of Advanced Research in Computer Science and Technology (IJARCST), ISSN 2347-9817, Vol. 2, Issue 2, Ver. 2, April-June 2014, pp. 296-299.
[11] Bhupinder Singh, Usvir Kaur, Dheerendra Singh, "Web Usage Clustering Algorithms: A Review", International Journal of Latest Scientific Research and Technology, ISSN 2348-9464, July 2014, pp. 1-7.
[12] Adaji, I., Vassileva, J., "Evaluating Personalization and Persuasion in E-Commerce", Proc. Int. Work. Pers. Persuas. Technol., 2016.
[13] Damanpour, Faramarz, Jamshid Ali Damanpour, "E-business e-commerce evolution: perspective and strategy", Managerial Finance, 2001; 27(7): 16-33.
[14] Gunasekaran, A. et al., "E-Commerce and its impact on operations management", International Journal of Production Economics, 2002; 75(1): 185-197.
[15] J. Tian, Software Quality Engineering: Testing, Quality Assurance and Quantifiable Improvement, IEEE Computer Society.
[16] Xia Wang, Ke Zhang, Qingtian Wu, "A Design of Security Assessment System for E-commerce Website", 2015 8th International Symposium on Computational Intelligence and Design.
[17] Ning Luo, Jungang Xu, "Application of Web data mining in E-commerce", Electronic Technology, 2012, 4:005.
[18] Jinyong Liu, "Web data mining research application in E-commerce", Network Security Technology and Application, 2013(9): 25-26.
[19] Li Kan, Hao Pan, "Application of Web data mining technology in E-commerce", Computer Knowledge and Technology, 2010, 4: 816-81.
[20] Yonghua Zhao, Hong Lin, "Web data mining applications in e-commerce", The 9th International Conference on Computer Science & Education (ICCSE 2014), August 22-24, 2014, Vancouver, Canada.
[21] Hye Young Lee, Minwoo Lee, Moon-Gil Yoon, "Website Development Strategy for e-commerce Success".
[22] Nor Haimimy Rawi, Marini Abu Bakar, Rokiah Bahari and Abdullah Mohd Zin, "Development Environment for Layout Design of e-commerce Applications Using Block-Based Approach", 2011 International Conference on Electrical Engineering and Informatics, 17-19 July 2011, Bandung, Indonesia.
[23] Chuan Lin, "The Evolution of E-commerce Payment", Technology and Investment, 2017, 8, 56-66, DOI: 10.4236/ti.2017.81005.




A SURVEY ON ONLINE MEDICAL SUPPORT SYSTEM

Shivani J. Sawarkar1, G.R. Shinde2
1,2

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India. [email protected], [email protected]

ABSTRACT
In our society, people pay more and more attention to their own fitness, and personalized fitness services are steadily rising. Due to the lack of skilled doctors and physicians, most healthcare organizations cannot meet the clinical demands of the public, who want more accurate and immediate results. Thus, more and more data mining applications are being developed to provide people with more customized healthcare services. This is a good answer to the mismatch between insufficient medical resources and growing medical demands. Here an AI-assisted prediction system is advocated which leverages data mining techniques to reveal the relationship between regular physical examination records and the potential health risk reported by the user or public. The main concept is to determine medical diseases according to given signs, symptoms and daily routine; in addition, when the user searches for a hospital, the nearest hospital to their current location is returned.

Keywords: Data mining, machine learning and disease prediction.

1. INTRODUCTION
Many healthcare organizations (hospitals, medical centers) in China are busy serving people with best-effort healthcare service. Nowadays, people pay more attention to their physical condition. They want higher quality and more personalized healthcare service. However, with the limited number of skilled doctors and physicians, most healthcare organizations cannot meet the needs of the public. How to provide higher quality healthcare to more people with limited manpower becomes a key issue. The healthcare environment is generally perceived as being 'information rich' yet 'knowledge poor'. Hospital information systems typically generate huge amounts of data in the form of numbers and text, and a lot of hidden information in these data goes untouched. Data mining and predictive analytics aim to reveal patterns and rules by applying advanced data analysis techniques to a large set of data for descriptive and predictive purposes. There is a wealth of data available within the healthcare systems. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. Manual review is inefficient, as each suspicious access has to be reviewed by a security expert, and is purely retrospective, as it occurs after damage may have been incurred [1]. Data mining is suitable for processing large datasets from hospital information systems and finding relations among data features.

The list of challenges, in order of the importance that they be solved if patients and organizations are to begin realizing the fullest possible benefits of these systems, consists of: improve the human-computer interface; disseminate best practices in CDS design, development, and implementation; summarize patient-level information; prioritize and filter recommendations to the user; create an architecture for sharing executable CDS modules and services; combine recommendations for patients with co-morbidities; prioritize CDS content development and implementation; create internet-accessible clinical decision support repositories; use free text information to drive clinical decision




support; mine large clinical databases to create new CDS [2]. It takes only a few researchers to analyze data from hospital information systems. Knowledge discovery and data mining have found numerous applications in business and scientific domains [3]. The main concept is to determine medical diseases according to given symptoms and daily routine; when the user searches for a hospital, the nearest hospital to their current location is returned. Data mining techniques used in the prediction of heart attacks are rule-based methods, decision trees and artificial neural networks [4]. The related queries are based on previously issued queries, and can be issued by the user to the search engine to tune or redirect the search process. The method proposed is based on a query clustering process in which groups of semantically similar queries are identified [5]. The clustering process uses the content of historical preferences of users registered in the query log of the search engine. The system provides a user-friendly interface for examinees and doctors. Examinees can know the symptoms which occurred in their body, while doctors can get a set of examinees with potential risk. A feedback mechanism could save manpower and improve the performance of the system automatically.

2. MOTIVATION
Previous medical examiners only used the basic symptoms of particular diseases, but in this application the examiner also examines word counts, laboratory results and diagnostic data. A feedback mechanism could save manpower and improve the performance of the system automatically. The doctor can fix a prediction result through an interface, which will collect doctors' input as new training data. An extra training process will be triggered every day using these data; thus, this system can improve the performance of the prediction model automatically. When the user visits the hospital physically, the user's personal record is saved and then that record is

added to the examiner data set. This consumes a lot of time.

3. REVIEW OF LITERATURE
Sittig D, Wright A, Osheroff J, et al. [1] There is a pressing need for high-quality, effective means of designing, developing, presenting, implementing, evaluating, and maintaining all types of clinical decision support capabilities for clinicians, patients and consumers. Using an iterative, consensus-building process, a rank-ordered list of the top 10 grand challenges in clinical decision support was identified. This list was created to educate and inspire researchers, developers, funders, and policy-makers. The list of challenges, in order of the importance that they be solved if patients and organizations are to begin realizing the fullest possible benefits of these systems, consists of: improve the human-computer interface; disseminate best practices in CDS design, development, and implementation; summarize patient-level information; prioritize and filter recommendations to the user; create an architecture for sharing executable CDS modules and services; combine recommendations for patients with co-morbidities; prioritize CDS content development and implementation; create internet-accessible clinical decision support repositories; use free text information to drive clinical decision support; mine large clinical databases to create new CDS. Identification of solutions to these challenges is critical if clinical decision support is to achieve its potential and improve the quality, safety and efficiency of healthcare.

Anderson J E, Chang D C, et al. [2] Many healthcare facilities enforce security on their electronic health records (EHRs) through a corrective mechanism: some staff nominally have almost unrestricted access to the records, but there is a strict ex post facto audit process for inappropriate accesses, i.e., accesses that violate the facility's security and privacy




policies. This process is inefficient, as each suspicious access has to be reviewed by a security expert, and is purely retrospective, as it occurs after damage may have been incurred. This motivates automated approaches based on machine learning using historical data. Previous attempts at such a system have successfully applied supervised learning models to this end, such as SVMs and logistic regression. While providing benefits over manual auditing, these approaches ignore the identity of the users and patients involved in a record access. Therefore, they cannot exploit the fact that a patient whose record was previously involved in a violation has an increased risk of being involved in a future violation. Motivated by this, a collaborative filtering inspired approach to predict inappropriate accesses is proposed in this paper. The solution integrates both explicit and latent features for staff and patients, the latter acting as a personalized "fingerprint" based on historical access patterns. The proposed method, when applied to real EHR access data from two tertiary hospitals and a file-access dataset from Amazon, shows not only significantly improved performance compared to existing methods, but also provides insights as to what indicates an inappropriate access.

Zhaoqian Lan, Guopeng Zhou, Yichun Duan, Wei Yan, et al. [3] The healthcare environment is generally perceived as being 'information rich' yet 'knowledge poor'. There is a wealth of data available within the healthcare systems. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. Knowledge discovery and data mining have found numerous applications in business and scientific domains. Valuable knowledge can be discovered from the application of data mining techniques in healthcare systems. In this study, the potential use of classification-based data mining techniques such as rule-based,

decision tree, naïve Bayes and artificial neural network techniques on massive volumes of healthcare data is briefly examined. The healthcare industry collects huge amounts of healthcare data which, unfortunately, are not "mined" to discover hidden information. For data preprocessing and effective decision making, a One-Dependency Augmented Naïve Bayes classifier (ODANB) and Naive Credal Classifier 2 (NCC2) are used. The latter is an extension of naïve Bayes to imprecise probabilities that aims at delivering robust classifications even when dealing with small or incomplete data sets. Discovery of hidden patterns and relationships often goes unexploited. Using medical profiles such as age, sex, blood pressure and blood sugar, the likelihood of patients getting a heart disease can be predicted. It enables significant knowledge, e.g. patterns and relationships between medical factors related to heart disease, to be established.

Srinivas K, Rani B K, Govrdhan A, et al. [4] In this paper, care services through telemedicine are provided, which has become an important part of the medical development process due to the latest innovations in information and computer technologies. Meanwhile, data mining, a dynamic and fast-expanding domain, has improved many fields of human life by offering the possibility of predicting future trends and helping with decision making based on the patterns and trends discovered. The diversity of data and the multitude of data mining techniques provide various applications for data mining, including in the healthcare organization. Integrating data mining techniques into telemedicine systems would help improve the efficiency and effectiveness of healthcare organizations' activity, contributing to the development and refinement of the healthcare services offered as part of the medical development process.




Gheorghe M, Petre R, et al. [5] In this paper a method is proposed that, given a query submitted to a search engine, suggests a list of related queries. The related queries are based on previously issued queries, and can be issued by the user to the search engine to tune or redirect the search process. The method proposed is based on a query clustering process in which groups of semantically similar queries are identified. The clustering process uses the content of historical preferences of users registered in the query log of the search engine. The method not only discovers the related queries, but also ranks them according to a relevance criterion. Finally, experiments over the query log of a search engine show the effectiveness of the method.

R. Baeza-Yates, C. Hurtado, and M. Mendoza, et al. [6] This work focuses on comparing a variety of techniques, approaches and tools and their impact on the healthcare sector. The goal of a data mining application is to turn data, which are facts, numbers, or text processable by a computer, into knowledge or information. The main purpose of data mining applications in healthcare systems is to develop an automated tool for identifying and disseminating relevant healthcare information. This paper aims to make a detailed study report of different types of data mining applications in the healthcare sector and to reduce the complexity of the study of healthcare data transactions. It also presents a comparative study of different data mining applications, techniques and methodologies applied for extracting knowledge from databases generated in the healthcare industry. Finally, the existing data mining techniques with data mining algorithms and application tools which are most valuable for healthcare services are discussed in detail.


Koh H C, Tan G, et al. [7] Many healthcare facilities enforce security on their electronic health records (EHRs) through a corrective mechanism: some staff nominally have almost unrestricted access to the records, but there is a strict ex post facto audit process for inappropriate accesses, i.e., accesses that violate the facility's security and privacy policies. This process is inefficient, as each suspicious access has to be reviewed by a security expert, and is purely retrospective, as it occurs after damage may have been incurred. This motivates automated approaches based on machine learning using historical data. Previous attempts at such a system have successfully applied supervised learning models to this end, such as SVMs and logistic regression. While providing benefits over manual auditing, these approaches ignore the identity of the users and patients involved in a record access. Therefore, they cannot exploit the fact that a patient whose record was previously involved in a violation has an increased risk of being involved in a future violation. Motivated by this, a collaborative filtering inspired approach to predict inappropriate accesses is proposed. The solution integrates both explicit and latent features for staff and patients, the latter acting as a personalized "finger-print" based on historical access patterns. The proposed method, when applied to real EHR access data from two tertiary hospitals and a file-access dataset from Amazon, shows not only significantly improved performance compared to existing methods, but also provides insights as to what indicates an inappropriate access.

Tao Jiang & Siyu Qian, et al. [8] The study aimed to identify risk factors in medication management in Australian residential aged care (RAC) homes. Only 18 out of 3,607 RAC homes failed the aged care accreditation standard in medication management between 7 March 2011 and 25 March 2015. Text data mining methods were used




to analyse the reasons for failure. This led to the identification of 21 risk indicators for an RAC home to fail in medication management. These indicators were further grouped into ten themes: overall medication management, medication assessment, ordering, dispensing, storage, stock and disposal, administration, incident report, monitoring, and staff and resident satisfaction. The top three risk factors are: "ineffective monitoring process" (18 homes), "non-compliance with professional standards and guidelines" (15 homes), and "resident dissatisfaction with overall medication management" (10 homes).

Song J H, Venkatesh S S, Conant E A, et al. [9] The k-means clustering and self-organizing maps (SOM) are applied to analyze the signal structure in terms of visualization. k-nearest neighbor classifiers (k-NN), support vector machines (SVM) and decision trees (DT) are employed to classify features using a computer-aided diagnosis (CAD) approach.

Song J H, Venkatesh S S, Conant E A, et al. [10] Breast cancer is one of the most common cancers in women. Sonography is now commonly used in combination with other modalities for imaging breasts. Although ultrasound can diagnose simple cysts in the breast with an accuracy of 96%-100%, its use for unequivocal differentiation between solid benign and malignant masses has proven to be more difficult. Despite considerable efforts toward improving imaging techniques, including sonography, the final confirmation of whether a solid breast lesion is malignant or benign is still made by biopsy.

V. Akgün, E. Erkut, and R. Batta, et al. [11] This work considers the problem of finding a number of spatially dissimilar paths between an origin and a destination. A number of dissimilar paths can be useful in solving capacitated flow problems or in selecting routes for hazardous materials. A critical discussion of three existing methods for the generation of spatially dissimilar paths is offered and computational experience using these methods is reported. As an alternative method, the generation of a large set of candidate paths and the selection of a subset using a dispersion model which maximizes the minimum dissimilarity in the selected subset is proposed.

T. Akiba, T. Hayashi, N. Nori, Y. Iwata, and Y. Yoshida, et al. [12] An indexing scheme for top-k shortest-path distance queries on graphs is proposed, which is useful in a wide range of important applications such as network-aware searches and link prediction. While many efficient methods for answering standard (top-1) distance queries have been developed, none of these methods are directly extensible to top-k distance queries. A new framework for top-k distance queries based on 2-hop cover is developed, and an efficient indexing algorithm based on the recently proposed pruned landmark labeling scheme is presented. The scalability, efficiency and robustness of the method are demonstrated in extensive experimental results.

A. Angel and N. Koudas, et al. [13] Diversity-aware search is studied in a setting that captures and extends established approaches, focusing on content-based result diversification. DIVGEN, an efficient threshold algorithm for diversity-aware search, is presented. DIVGEN utilizes novel data access primitives, offering the potential for significant performance benefits. The choice of data accesses to be performed is crucial to performance, and a hard problem in its own right. Thus a low-overhead, intelligent data access prioritization scheme is proposed, with theoretical quality guarantees and good performance in practice.




H. Bast, D. Delling, A. V. Goldberg, M. Müller-Hannemann, T. Pajor, P. Sanders, D. Wagner, and R. F. Werneck, et al. [14] A survey of recent advances in algorithms for route planning in transportation networks is presented. For road networks, it is shown that one can compute driving directions in milliseconds or less even at continental scale. A variety of techniques provide different trade-offs between preprocessing effort, space requirements, and query time. Some algorithms can answer queries in a fraction of a microsecond, while others can deal efficiently with real-time traffic.

Borodin A., Lee H. C., and Ye Y., et al. [15] Result diversification is an important aspect of web-based search, document summarization, facility location, portfolio management and other applications. Given a set of ranked results for a set of objects (e.g. web documents, facilities, etc.) with a distance between any pair, the goal is to select a subset S satisfying the following three criteria: (a) the subset S satisfies some constraint (e.g. bounded cardinality); (b) the subset contains results of high "quality"; and (c) the subset contains results that are "diverse" relative to the distance measure. The goal of result diversification is to produce a diversified subset while maintaining quality as high as possible.

4. OPEN ISSUES
According to the survey, the system leverages data mining methods to reveal the relationship between regular physical examination records and the potential health risk. It can predict examinees' risk


of physical status next year based on the physical examination records of this year. Examinees can know their potential health risks, while doctors can get a set of examinees with potential risk. It is a good solution for the mismatch between insufficient medical resources and rising medical demands. Various supervised machine learning methods are applied, including decision trees and XGBoost, to predict potential health risks of examinees using their physical examination records. Examinees can know the symptoms which occurred in their body (their potential health risks), while doctors can get a set of examinees with potential risk.

5. PROPOSED SYSTEM
After the analysis of the previous system, this system's main concept is to determine medical diseases according to given symptoms and daily routine; when the user searches for a hospital, the nearest hospital to their current location is returned. The system provides a user-friendly interface for examinees and doctors. Examinees can know the symptoms which occurred in their body, while doctors can get a set of examinees with potential risk. A feedback mechanism could save manpower and improve the performance of the system automatically. The doctor can fix a prediction result through an interface, which collects doctors' input as new training data. An extra training process is triggered every day using these data; thus, our system can improve the performance of the prediction model automatically.
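The nearest-hospital lookup described above can be sketched with the haversine great-circle distance. This is a minimal sketch under assumptions: the hospital names, coordinates and function names are illustrative, not from the paper.

```python
import math

# Sketch of the "nearest hospital to the user's current location" step.
# Hospital names and coordinates below are made-up examples.

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_hospital(user_loc, hospitals):
    """Return the hospital record closest to user_loc = (lat, lon)."""
    return min(hospitals,
               key=lambda h: haversine_km(user_loc[0], user_loc[1], h["lat"], h["lon"]))

hospitals = [
    {"name": "Hospital A", "lat": 18.46, "lon": 73.83},  # assumed coordinates
    {"name": "Hospital B", "lat": 18.52, "lon": 73.86},
]
```

In a deployed system the coordinate list would come from the hospital database and the user location from the device; a spatial index would replace the linear scan for large datasets.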




[Diagram: the user searches with a keyword; after preprocessing and data cleaning against the database, the logical part applies machine learning (Random Forest and a baseline algorithm) to the given symptoms to produce the prediction result, while the user's current location is used to show the nearest hospital and select a doctor.]
Figure 1: System Overview
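The partition-based experiment of Section 6 (instrument a pivot-based sort so it counts array-element comparisons, then average the counts over randomly ordered arrays of size n) can be sketched as follows; the last-element pivot and the trial count are assumed choices, not taken from the paper.

```python
import random


def quicksort_count(a):
    """Quicksort that counts comparisons between array elements
    (comparisons between array indices are not counted)."""
    count = 0

    def sort(lo, hi):
        nonlocal count
        if lo >= hi:
            return
        pivot = a[hi]  # last-element pivot: one arbitrary choice
        i = lo
        for j in range(lo, hi):
            count += 1  # one comparison of array elements
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        sort(lo, i - 1)
        sort(i + 1, hi)

    sort(0, len(a) - 1)
    return count


def average_comparisons(n, trials=20):
    """Average comparison count over `trials` randomly ordered arrays."""
    total = 0
    for _ in range(trials):
        arr = list(range(n))
        random.shuffle(arr)
        total += quicksort_count(arr)
    return total / trials


for n in (500, 1000, 2000):
    print(n, round(average_comparisons(n), 1))
```

Graphing these averages against n should "make sense": they should track the well-known ~1.39 n log2 n expected comparison count for randomized quicksort.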

This system increases human-computer interaction. The user's location is detected, and a hospital and doctor are recommended to the patient according to the predicted disease. Medicines are suggested for the predicted disease. This prediction system is fast, scalable and low-cost.

6. ALGORITHMS
Random Forest Algorithm: The random forest algorithm begins by randomly selecting "k" features out of the total "m" features, so that both features and observations are randomly sampled. Next, the randomly selected "k" features are used to find the root node using the best-split approach, and the daughter nodes are calculated with the same best-split approach. These stages are repeated until a tree is formed with a root node and the target as the leaf node. Finally, the whole procedure is repeated to create "n" randomly built trees, which together form the random forest.

Partition-based Algorithm: Implement the algorithm, test it, and instrument it so that it counts the number of comparisons of array elements (don't count comparisons between array indices). Test it to see whether the counts "make sense". For values of n from 500 to 10000,


training process to improve performance every day.

8. ACKNOWLEDGEMENT
The authors would like to thank the researchers as well as publishers for making their resources available, and our teachers for their guidance. We are thankful to the authorities of Savitribai Phule Pune University and the concerned members of the ICINC 2019 conference for their constant guidance and support. We are also thankful to the reviewers for their valuable suggestions. We also thank the college authorities for providing the required infrastructure and support. Finally, we would like to extend heartfelt gratitude to our friends and family members.

REFERENCES
[1] Sittig D, Wright A, Osheroff J, et al. "Grand challenges in clinical decision support". Journal of Biomedical Informatics, 2008.
[2] Anderson J E, Chang D C. "Using Electronic Health Records for Surgical Quality Improvement in the Era of Big Data". JAMA Surgery, 2015.
[3] Zhaoqian Lan, Guopeng Zhou, Yichun Duan, Wei Yan. "AI-assisted Prediction on Potential Health Risks with Regular Physical Examination Records". IEEE Transactions on Knowledge and Data Science, 2018.
[4] Srinivas K, Rani B K, Govrdhan A. "Applications of Data Mining Techniques in Healthcare and Prediction of Heart Attacks". International Journal on Computer Science & Engineering, 2010.
[5] Gheorghe M, Petre R. "Integrating Data Mining Techniques into Telemedicine Systems". Informatica Economica Journal, 2014.
[6] R. Baeza-Yates, C. Hurtado, and M. Mendoza. "Query recommendation using query logs in search engines". In Proc. Int. Conf. Current Trends Database Technol., 2004, pp. 588–596.
[7] Koh H C, Tan G. "Data mining applications in healthcare". Journal of Healthcare Information Management (JHIM), 2005, 19(2):64–72.
[8] Menon A K, Jiang X, Kim J, et al. "Detecting Inappropriate Access to Electronic Health Records Using Collaborative Filtering". Machine Learning, 2014, 95(1):87–101.
[9] Tao Jiang, Siyu Qian, et al. "Accreditation Reports to Identify Risk Factors in Medication Management in Australian Residential Aged Care Homes". Studies in Health Technology & Informatics, 2017.
[10] Song J H, Venkatesh S S, Conant E A, et al. "Comparative analysis of logistic regression and artificial neural network for computer-aided diagnosis of breast masses". Academic Radiology, 2005, 12(4):487–495.
[11] V. Akgün, E. Erkut, and R. Batta. "On finding dissimilar paths". European Journal of Operational Research, 121(2):232–246, 2000.
[12] T. Akiba, T. Hayashi, N. Nori, Y. Iwata, and Y. Yoshida. "Efficient top-k shortest-path distance queries on large networks by pruned landmark labeling". In Proc. AAAI, pages 2–8, 2015.
[13] A. Angel and N. Koudas. "Efficient diversity-aware search". In Proc. SIGMOD, pages 781–792, 2011.
[14] H. Bast, D. Delling, A. V. Goldberg, M. Müller-Hannemann, T. Pajor, P. Sanders, D. Wagner, and R. F. Werneck. "Route planning in transportation networks". In Algorithm Engineering, pages 19–80, 2016.
[15] Borodin A, Lee H C, Ye Y. "Max-sum diversification, monotone submodular functions and dynamic updates". Computer Science, pages 155–166, 2012.


NATURAL LANGUAGE QUESTION ANSWERING SYSTEM USING RDF FRAMEWORK Maruti K. Bandgar1, Avinash H. Jadhav2, Ashwini D. Thombare3, Poornima D. Asundkar4, Prof.P.P.Patil5 1,2,3,4,5

Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India. [email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
To answer a natural language question, the existing work takes a two-stage approach: question understanding and query evaluation. Its focus is on question understanding, to deal with the disambiguation of natural language phrases. The most common technique is joint disambiguation, which has an exponential search space. In this paper, we propose a systematic framework to answer natural language questions over an RDF repository (RDF Q/A) from a graph data-driven perspective. We propose a semantic query graph to model the query intention of the natural language question in a structural way; based on this, RDF Q/A is reduced to a subgraph matching problem. More importantly, we resolve the ambiguity of natural language questions at the time when matches of the query are found; the cost of disambiguation is saved if no matches are found. More specifically, we propose two different frameworks to build the semantic query graph: one is relation (edge)-first and the other is node-first. We compare our method with some state-of-the-art RDF Q/A systems on a benchmark dataset. Extensive experiments confirm that our method not only improves precision but also greatly speeds up query performance. A typical knowledge-based question answering (KB-QA) system faces two challenges, one of which is to transform natural language questions into their meaning representations (MRs).

Key Words: RDF, Q/A, SPARQL query, semantic query graph.

1. INTRODUCTION
The proposed system focuses on question understanding to deal with the disambiguation of natural language phrases; the usual joint disambiguation has an exponential search space. We propose a framework to answer natural language questions over an RDF repository using a graph data-driven technique. A semantic query graph models the query knowledge of the natural language question in a structural way, by which RDF question answering is reduced to a subgraph matching problem. More importantly, we resolve the ambiguity of natural language questions at the time when matches of the query are found; the cost of disambiguation is saved if no matches are found. In our system we use this semantic query information with a knowledge graph to find the best answer; because of this we can overcome problems that occur in real-time applications such as Quora and Stack Overflow.

2. LITERATURE SURVEY
[1] "Knowledge-based question answering as machine translation": A typical knowledge-based question answering (KB-QA) system faces two challenges: one is to transform natural language questions into their meaning representations (MRs); the other is to retrieve answers from knowledge bases (KBs) using the generated MRs. The paper presents a translation-based KB-QA method that integrates semantic parsing and QA in one unified framework. The system faces the challenge of transforming


natural language questions into their meaning representations (MRs).

[2] "Robust question answering over the web of linked data": Knowledge bases and the Web of Linked Data have become important assets for search, recommendation, and analytics. Natural-language questions are a user-friendly mode of tapping this wealth of knowledge and data. With the explosion of structured data on the Web, translating natural-language questions into structured queries seems the most intuitive approach.

[3] "A unified framework for approximate dictionary-based entity extraction": Dictionary-based entity extraction identifies predefined entities from documents. A recent trend for improving extraction recall is to support approximate entity extraction, which finds all substrings from documents that approximately match entities in a given dictionary.

[4] "Evaluating question answering over linked data": The availability of large amounts of open, distributed and structured semantic data on the web has no precedent in the history of computer science. In recent years, there have been important advances in semantic search and question answering over RDF data. The importance of interfaces that bridge the gap between the end user and Semantic Web data has been widely recognized.

[5] "Question answering on freebase via relation extraction and textual evidence": Existing knowledge-based question answering systems often rely on small annotated training data. While shallow methods like relation extraction are robust to data scarcity, they are less expressive than deep meaning representation methods like semantic parsing, thereby failing at answering questions involving multiple constraints.


3. GAP ANALYSIS
[1] Knowledge-based question answering as machine translation: A typical knowledge-based question answering (KB-QA) system faces two challenges: one is to transform natural language questions into their meaning representations (MRs); the other is to retrieve answers from knowledge bases (KBs) using the generated MRs. Remarks: The system faces the challenge of transforming natural language questions into their meaning representations (MRs).

[2] Robust question answering over the web of linked data: Knowledge bases and the Web of Linked Data have become important assets for search, recommendation, and analytics. Natural-language questions are a user-friendly mode of tapping this wealth of knowledge and data. Remarks: However, question answering technology does not work robustly in this setting, as questions have to be translated into structured queries and users have to be careful in phrasing their questions.

[3] A unified framework for approximate dictionary-based entity extraction: Dictionary-based entity extraction identifies predefined entities from documents. A recent trend for improving extraction recall is to support approximate entity extraction, which finds all substrings from documents that approximately match entities in a given dictionary. Remarks: There are no evaluations so far that systematically evaluate this kind of system, in contrast to question answering and search interfaces to document spaces.

[4] Evaluating question answering over linked data: The availability of large amounts of open, distributed and structured semantic data on the web has no precedent in the history of computer science. In recent years, there


have been important advances in semantic search and question answering over RDF data. In particular, natural language interfaces to online semantic data have the advantage that they can exploit the expressive power of Semantic Web data models and query languages, while at the same time hiding their complexity from the user. Remarks: There are no evaluations so far that systematically evaluate this kind of system, in contrast to traditional question answering and search interfaces to document spaces.

[5] Question answering on freebase via relation extraction and textual evidence: Existing knowledge-based question answering systems often rely on small annotated training data. Remarks: While shallow methods like relation extraction are robust to data scarcity, they are less expressive than deep meaning representation methods like semantic parsing, thereby failing at answering questions involving multiple constraints. This paper, which uses deep learning, is the most useful for our system.

4. CURRENT SYSTEM
The hardness of RDF Q/A in the existing system lies in the ambiguity of unstructured natural language question sentences. Generally, there are two main challenges.

Phrase Linking: A natural language phrase w may have several meanings, i.e., w may correspond to several semantic items in the RDF graph G. As shown in Figure 1(b), the entity phrase "Paul Anderson" can map to three persons: ⟨Paul Anderson (actor)⟩, ⟨Paul S. Anderson⟩ and ⟨Paul W. S. Anderson⟩. A relation phrase such as "directed by" also refers to two possible predicates, ⟨director⟩ and ⟨writer⟩. Sometimes a phrase needs to be mapped to a non-atomic structure in the knowledge graph; for example, "uncle of" refers to a predicate path (see Table 4). In RDF Q/A systems, we should eliminate "the ambiguity of phrase linking".

Composition: The task of composition is to construct the corresponding query or query graph by assembling the identified phrases. In the running example, we know the predicate ⟨director⟩ connects subject ⟨film⟩ and object ⟨Paul W. S. Anderson⟩; consequently, we generate a triple ⟨film, director, Paul W. S. Anderson⟩. However, in some cases it is difficult to determine the correct subject and object for a given predicate, or there may exist several possible query graph structures for a given question sentence. We call this "the ambiguity of query graph structure".
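The reduction to subgraph matching can be illustrated with a toy triple store. Only the "Paul W. S. Anderson" / ⟨film, director, ...⟩ example names come from the text; the film triples and the naive backtracking `match` helper below are made-up illustrations, and a real RDF Q/A system would evaluate SPARQL over a proper store instead.

```python
# Toy RDF graph as (subject, predicate, object) triples.
G = {
    ("Resident_Evil", "type", "film"),
    ("Resident_Evil", "director", "Paul_W_S_Anderson"),
    ("Event_Horizon", "type", "film"),
    ("Event_Horizon", "director", "Paul_W_S_Anderson"),
    ("Boogie_Nights", "director", "Paul_T_Anderson"),
}


def match(patterns, binding=None):
    """Naive backtracking subgraph matching. Strings starting with '?'
    are variables; returns every consistent variable binding."""
    binding = binding or {}
    if not patterns:
        return [binding]
    first, rest = patterns[0], patterns[1:]
    results = []
    for triple in G:
        b = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                if b.get(term, value) != value:  # conflicting binding
                    ok = False
                    break
                b[term] = value
            elif term != value:  # constant must match exactly
                ok = False
                break
        if ok:
            results.extend(match(rest, b))
    return results


# Semantic query graph for "Which films are directed by Paul W. S. Anderson?"
query = [("?film", "type", "film"),
         ("?film", "director", "Paul_W_S_Anderson")]
answers = sorted({b["?film"] for b in match(query)})
print(answers)  # → ['Event_Horizon', 'Resident_Evil']
```

Note how disambiguation falls out of evaluation: a wrong linking (e.g. ⟨Paul T. Anderson⟩ for this query graph) simply produces no matches, which is the paper's point about saving the cost of disambiguation.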

5. PROPOSED SYSTEM APPROACH
This system uses a framework to answer natural language questions over an RDF repository using a graph data-driven technique. A semantic query graph is used to model the query knowledge of the natural language question in a structural way, by which RDF question answering is reduced to a subgraph matching problem. More importantly, the system resolves the ambiguity of natural language questions at the time when matches of the query are found; the cost of disambiguation is saved if no match is found. The system uses this semantic query information with a knowledge graph to find the best answer; because of this it overcomes problems that occur in real-time applications such as Quora and Stack Overflow.

6. SYSTEM ARCHITECTURE


Fig. 6.1. System Architecture

7. FLOW DIAGRAM

Fig. 7.1. Flow Diagram

The above diagram shows the actual flow of the system.

8. CONCLUSION
In our system, a graph data-driven framework answers natural language questions over Resource Description Framework graphs. Unlike existing work, the ambiguity of both phrases and structure is not removed in the question understanding stage; the system pushes the disambiguation down into the query evaluation stage. Based on the query results over Resource Description Framework graphs, we can address the ambiguity issue efficiently. In other words, the system combines disambiguation and query evaluation in a uniform process.

REFERENCES
[1] Junwei Bao, Nan Duan, Ming Zhou, Tiejun Zhao. "Knowledge-based question answering as machine translation". Baltimore, Maryland, USA, June 23–25, 2014.
[2] Mohamed Yahya, Klaus Berberich, Shady Elbassuoni, Gerhard Weikum. "Robust question answering over the web of linked data".
[3] Dong Deng, Guoliang Li, Jianhua Feng, Yi Duan. "A unified framework for approximate dictionary-based entity extraction". Springer-Verlag Berlin Heidelberg, 2014.
[4] Vanessa Lopez, Christina Unger, Philipp Cimiano, Enrico Motta. "Evaluating question answering over linked data".
[5] Kun Xu, Siva Reddy, Yansong Feng, Songfang Huang, Dongyan Zhao. "Question answering on freebase via relation extraction and textual evidence". Berlin, Germany, August 7–12, 2016.
[6] W. M. Soon, H. T. Ng, and D. C. Y. Lim. "A machine learning approach to coreference resolution of noun phrases". Comput. Linguist., vol. 27, no. 4, pp. 521–544, 2001.
[7] L. Androutsopoulos. "Natural Language Interfaces to Databases – An Introduction". Journal of Natural Language Engineering, 1 (1995), 29–81.
[8] V. I. Spitkovsky and A. X. Chang. "A cross-lingual dictionary for English Wikipedia concepts". In Proc. LREC 2012, Istanbul, Turkey, May 23–25, 2012, pp. 3168–3175.
[9] C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. New York: Cambridge University Press, 2008.
[10] N. Nakashole, G. Weikum, and F. M. Suchanek. "Discovering and exploring relations on the web". PVLDB, vol. 5, no. 12, pp. 1982–1985, 2012.


TECHNIQUE FOR MOOD BASED CLASSIFICATION OF MUSIC BY USING C4.5 CLASSIFIER Manisha Rakate1, Nandan More2

1,2

Department of Computer Engineering,TSSM‘s BSCOER,Narhe,Pune, Savitribai Phule Pune University, Pune, 411041, Maharashtra, India. [email protected], [email protected]

ABSTRACT
With today's rapid growth of the internet, downloading and purchasing music from websites is increasing intensely. There is a distinct relation between music and human emotions: we listen to songs according to our mood. A number of methods have been implemented for selecting music according to mood, so there is a need for a method that classifies music by the human mood. In this paper we propose a system which classifies the moods of distinct types of music; the C4.5 classifier is used for the classification. By testing the classification system on various mood dimensions, we examine to what extent the linguistic part of music reveals adequate information for assigning a mood category, and which aspects of mood can be classified best based on extracted features only.

Keywords— Data mining, mood classification, timbre features, modulation features, SVM.

1. INTRODUCTION
In the past few years, research in Music Information Retrieval has been very active. It has produced automatic classification methods to cope with the growing amount of digital music available. One problem that has arisen is the automatic mood classification of music: a system takes the waveform of a musical piece as input and outputs text labels that describe the mood in the music (happy, sad, etc.). A number of machine learning and statistical analysis techniques are applied, and it has been demonstrated that audio-based techniques can achieve satisfying results. By using a few simple mood categories and checking for reliable agreement between people, automatic classification based on audio features gives promising results.

Mood in music can be judged by any human listener, but expertise tends to be limited to a particular genre of music or to Western music theory, much of which does not apply to music from other parts of the world. Music information retrieval has focused on automatically extracting information from musical sources, which come in many formats including written score and audio. The field has discovered features for predicting genre, determining the key and tempo of music, distinguishing instruments, analyzing the similarity of music, transcribing audio to score, and eliciting musical information from written scores. Psychological studies (initially at the Music Technology Group, Universitat Pompeu Fabra) have shown that part of the semantic information of songs resides exclusively in the lyrics; lyrics can contain relevant emotional information which is not included in the audio.

Music indeed has an emotional quotient attached to it. It is necessary to know which intrinsic factors present in music associate it with a particular mood or emotion. Music can be concordant or discordant; this is known from the physics of wave propagation, and musical systems which have emerged across the world reflect this: very discordant sounds are perceived negatively. Audio features are mathematical functions calculated over the audio data that describe some unique aspect of that data. In the last few decades a


number of features has been developed for the analysis of audio content. A substantial amount of work has been dedicated to modeling the relationships between music and emotions, spanning psychology, musicology and music information retrieval. Proposed emotion models follow either the categorical approach or the dimensional approach. Categorical approaches represent emotions as a set of categories that are clearly distinct from each other; for example, six basic emotion categories based on the human facial expressions of anger, fear, happiness, sadness, disgust and surprise. Another famous categorical approach is Hevner's affective checklist, where eight clusters of affective adjectives were discovered and laid out in a circle, as shown in Fig. 1. Each cluster includes similar adjectives, and the meaning of neighboring clusters varies in a cumulative way until reaching a contrast in the opposite position.

In this paper we discuss the related work in Section II and the proposed approach (module descriptions, mathematical modeling, algorithm and experimental setup) in Section III, and finally we provide a conclusion in Section IV.

2. LITERATURE REVIEW
This section discusses the literature in detail. Many short-term and long-term modulation and timbre features have been developed for content-based music classification. However, two operations commonly used in modulation analysis discard information that is useful for modulation modeling and thereby degrade classification performance. To deal with this problem, Ren et al. [1] proposed a two-dimensional representation of acoustic frequency and modulation frequency, extracting joint acoustic-frequency and modulation-frequency features. Long-term joint frequency

features like acoustic-modulation spectral contrast/valley (AMSC/AMSV), acoustic modulation spectral flatness measure (AMSFM), and acoustic-modulation spectral crest measure (AMSCM), are after computes from the spectra of each joint frequency sub-band. The prominent status of music in human culture and everyday life is due in large part to its striking ability to elicit emotions. It may have slight variation in mood to changes in our physical condition and actions. M. Barthet et al. [2] describes study of music and emotions from different disciplines including psychology, musicology and music information retrieval. music information retrieval propose new insights to enhance automated music emotion recognition models. C.-H. Lee et al. [3] proposed an automatic music genre classification approach based on long-term modulation spectral analysis of spectral (OSC and MPEG-7 NASE) and cepstral (MFCC) features. Modulation spectral analysis of every will generates a modulation spectrum. All the modulation spectra are collected to form a modulation spectrogram. Which exhibits the timevarying or rhythmic information of music signals Each modulation spectrum is then decomposed into several logarithmicallyspaced modulation sub-bands. The MSC and MSV are then computed from each modulation sub-band. Y. Songet al. [4] proposed a collected a truth data set of 2904 songs, that have been tagged with one of the four words ―happy‖, ―sad‖, ―angry‖ and ―relaxed‖. Audio is then retrieved from 7Digital.com, and by using standard algorithms sets of audio features are extracted. By using support vector machins there are two classifiers are trained. with the polynomial and radial basis function kernels and these are tested with 10-fold cross validation. Results show that spectral

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 285

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

features outperform those based on rhythm, dynamics, and, to a lesser extent, harmony.

underlying machine, Processes different audio and visual low and mid level features.

Y. Panagak et al. [5] proposed the automatic mood classification problem. By resorting the low rank representation of slow auditory spectro-temporal modulations. Recently, If each data class is linearly spanned by a subspace of unknown dimensions and the data are noiseless. The lowest-rank representation of a set of test vector samples with respect to a set of training vector samples has the nature of being both dense for within-class affinities and almost zero for betweenclass affinities. LRR exacts the classification of the data, result is LowRank Representation-based Classification (LRRC). The LRRC is compared against three well-known classifiers, namely the Sparse Representations-based Classifier, SVM and Nearest Neighbor classifiers for music mood classification by conducting experiments on the MTV and the Soundtracks180 datasets.

In paper [8] authors proposed a way in which music can be displayed for the user which is based on similarity of the acoustic features. All songs are in music library onto a 2D feature space. The user can better understand the relationship between the songs, with the distance between each song reflecting its acoustic similarity. Low-level acoustic features are extracted from the raw audio signals and performing dimension reduction using PCA on the feature space. The proposed approach avoids the dependence of contextual data called as metadata and collaborative filtering methods. By using song space visualizer, the user can chose songs or allow the system to automate the song selection process given a seed song.

In paper [6] authors proposed method using cell mixture models to automate the task of music emotion classification Designed system has potential application of both unsupervised and supervised classification learning. This system is acceptable for music mood classification. The ICMM is suitable for the music emotion classification. In paper [7] authors given a technical solution for automated slideshow generation by extracting a set of high-level features from music Like beat grid, mood, genre and intelligently combining this set with image high-level features. For example, the user request the system to automatically create a slideshow, which plays soft music and shows pictures with sunsets from the last 10 years of his own photo collection. The high-level feature extraction the audio and visual information which is based on the same ISSN:0975-887

In paper [9] authors proposed a method, which considers the various kinds of audio features. A bin histogramhas been computed from each feature‘s frame to save all needed data related with it. The histogram bins are used for calculating the similarity matrix. The number of similarity matrices depends on the number of audio features. There are 59 similarity matrixes. To compute the intra-inter similarity ratio, the intra and inter similarity matrix are utilized. These similarity ratios are sorted in descending order in each feature. From this some of the selected similarity ratios are ultimately used as prototypes from each feature. Further used for classification by designing the nearest multi-prototype classifier. In paper [10] authors proposed a selfcolored music mood segmentation and a hierarchical framework based on new mood taxonomy model to automate the task of multi-label music mood classification. The taxonomy model combines Thayer‘s 2D models . Schubert‘s Updated Hevner adjective Model (UHM)


to mitigate the probability of error causing by classifying upon maximally 4 class classification from 9. The verse and chorus parts approximately 50 to 110 sec of the whole songs is exerted manually as input music trims in this system. The extracted feature sets from these segmented music pieces are ready to inject the FSVM for classification. 3. PROPOSED APPROACH Proposed System Overview We implement a feature set for music mood classification, which combine modulation spectral analysis of MFCC, OSC, and SFM/SCM and statistical descriptors of short-term timbre features. By employing these features for SVMs, our submission to the audio mood classification task was ranked #1. In fact, the submission outperformed all the other submissions of the task from 2008 to2014, indicating the superiority of the proposed feature sets. Moreover, based on a part of the aforementioned feature sets, we have also proposed another new feature set that combines the newly proposed joint frequency features (including AMSC/AMSV and AMSFM/AMSCM), together with the modulation spectral analysis of MFCC, and statistical descriptors of short-term timbre features. Experiments conducted on Raga Music Dataset. Explore the possibility of using dimensionality reduction techniques to extract a compact feature set that can achieve equal or better performance

Figure 1: Proposed System Architecture

Mathematical Model
For a joint acoustic-modulation spectrogram, we can compute four joint frequency features, namely AMSC, AMSV, AMSFM, and AMSCM, each of which is a matrix of size A×B.

AMSP and AMSV
For each joint acoustic-modulation frequency sub-band, we compute the acoustic-modulation spectral peak (AMSP) and the acoustic-modulation spectral valley (AMSV) as follows:
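The equation images were lost in extraction; following the OSC-style peak/valley definitions in [3] (a reconstruction, using the S_{a,b}, N_{a,b}, and α notation defined just below), the two quantities are plausibly:

```latex
\mathrm{AMSP}_{a,b} = \log\!\left(\frac{1}{\alpha N_{a,b}}\sum_{i=1}^{\lfloor \alpha N_{a,b} \rfloor} S_{a,b}[i]\right),
\qquad
\mathrm{AMSV}_{a,b} = \log\!\left(\frac{1}{\alpha N_{a,b}}\sum_{i=1}^{\lfloor \alpha N_{a,b} \rfloor} S_{a,b}[N_{a,b}-i+1]\right)
```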

For simplicity, we can assume S_{a,b} is a vector sorted in descending order, in which S_{a,b}[i] is the i-th element of S_{a,b}, N_{a,b} is the total number of elements in S_{a,b}, and α is a neighborhood factor identical to that used in computing OSC.

ISSN:0975-887


The difference between AMSP and AMSV, denoted AMSC (acoustic-modulation spectral contrast), can be used to reflect the spectral contrast over a joint frequency sub-band:
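The missing equation is, from the definition just given:

```latex
\mathrm{AMSC}_{a,b} = \mathrm{AMSP}_{a,b} - \mathrm{AMSV}_{a,b}
```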

Figure 2 shows the time comparison graph of the proposed system against the existing system; the graph is plotted from Table 1.

AMSFM
To measure the noisiness and sinusoidality of the modulation spectra, we further define the acoustic-modulation spectral flatness measure (AMSFM) as the ratio of the geometric mean to the arithmetic mean of the modulation spectra within a joint frequency sub-band:
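In the notation above, this ratio (a reconstruction of the lost equation image) is:

```latex
\mathrm{AMSFM}_{a,b} =
\frac{\left(\prod_{i=1}^{N_{a,b}} S_{a,b}[i]\right)^{1/N_{a,b}}}
     {\frac{1}{N_{a,b}} \sum_{i=1}^{N_{a,b}} S_{a,b}[i]}
```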

AMSCM
The acoustic-modulation spectral crest measure (AMSCM) can be defined as the ratio of the maximum to the arithmetic mean of the modulation spectra within a joint frequency sub-band:
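Since the equation images did not survive extraction, the four per-sub-band quantities can also be sketched in code (a minimal numpy illustration, not the authors' implementation; the log scaling of peak/valley follows the OSC-style definitions in [3], and the function name and default α are assumptions):

```python
import numpy as np

def joint_band_features(S, alpha=0.2):
    """AMSP/AMSV/AMSC/AMSFM/AMSCM for one joint acoustic-modulation
    frequency sub-band S (non-negative modulation-spectral magnitudes)."""
    s = np.sort(S.ravel())[::-1]                 # S_{a,b} sorted descending
    n = s.size                                   # N_{a,b}
    k = max(1, int(round(alpha * n)))            # neighborhood factor alpha
    amsp = np.log(s[:k].mean())                  # peak: mean of k largest
    amsv = np.log(s[-k:].mean())                 # valley: mean of k smallest
    amsc = amsp - amsv                           # spectral contrast
    amsfm = np.exp(np.log(s).mean()) / s.mean()  # geometric / arithmetic mean
    amscm = s.max() / s.mean()                   # maximum / arithmetic mean
    return amsp, amsv, amsc, amsfm, amscm
```

A perfectly flat sub-band gives AMSC = 0 and AMSFM = AMSCM = 1; peakier sub-bands push AMSC and AMSCM up and AMSFM toward 0.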

Fig. 2: Time Graph

4. RESULTS AND DISCUSSION
A. Expected Results
This section discusses the experimental results of the proposed system. Table 1 shows the time required for music mood classification by the proposed system using C4.5 and by the existing system using KNN; the C4.5 classifier requires less time than KNN.

Table 2 shows the memory required by the proposed system using C4.5 and by the existing system using KNN; the existing system consumes more memory than the proposed system.

Table 2: Memory Comparison for Clustering

System                 | Memory Required
Existing system (KNN)  | 2500 kb
Proposed system (C4.5) | 1800 kb

Figure 3 shows the memory comparison graph of the proposed system against the existing system.

Table 1: Time Comparison for Clustering

System                 | Time Required
Existing system (KNN)  | 1500 ms
Proposed system (C4.5) | 900 ms

Fig 3: Memory Graph

5. CONCLUSION AND FUTURE SCOPE
We found that two operations (which compute the representative feature


spectrogram and the mean and standard deviation of the MSC/MSV matrices) in the modulation spectral analysis of short-term timbre features are likely to smooth out useful modulation information, so we propose the use of a joint frequency representation of an entire music clip to extract joint frequency features. These joint frequency features, including the acoustic-modulation spectral contrast/valley, the acoustic-modulation spectral flatness measure, and the acoustic-modulation spectral crest measure, outperform the modulation spectral analysis of OSC and SFM/SCM on the Raga Music dataset by small margins. The advantage of the proposed features is their better discriminative power, since they operate on the entire music clip with no averaging over the local modulation features. The extracted features are used to classify test music files according to the moods of the test files; for classification, the C4.5 classifier is used. The system can be enhanced with mood classification in music videos. We will also apply these features to multi-label tasks such as auto-tagging and tag-based retrieval.

REFERENCES
[1] J.-M. Ren, M.-J. Wu, and J.-S. R. Jang, "Automatic music mood classification based on timbre and modulation features," IEEE Transactions on Affective Computing, vol. 6, no. 3, pp. 236-246, 2015.
[2] M. Barthet, G. Fazekas, and M. Sandler, "Multidisciplinary perspectives on music emotion recognition: recommendations for content- and context-based models," Proc. CMMR, pp. 492-507, 2012.
[3] C.-H. Lee, J.-L. Shih, K.-M. Yu, and H.-S. Lin, "Automatic music genre classification based on modulation spectral analysis of spectral and cepstral features," IEEE Transactions on Multimedia, vol. 11, no. 4, pp. 670-682, 2009.
[4] Y. Song, S. Dixon, and M. Pearce, "Evaluation of musical features for emotion classification," in Proc. 13th International Society for Music Information Retrieval Conference (ISMIR), Porto, Portugal, Oct. 2012, pp. 523-528.
[5] Y. Panagakis and C. Kotropoulos, "Automatic music mood classification via low-rank representation," 2011, pp. 689-693.
[6] X. Sun and Y. Tang, "Automatic music emotion classification using a new classification algorithm," in Proc. Second International Symposium on Computational Intelligence and Design (ISCID '09), Changsha, 2009, pp. 540-542.
[7] P. Dunker, C. Dittmar, A. Begau, S. Nowak, and M. Gruhne, "Semantic high-level features for automated cross-modal slideshow generation," in Proc. Seventh International Workshop on Content-Based Multimedia Indexing, Chania, 2009, pp. 144-149.
[8] M. S. Y. Aw, C. S. Lim, and A. W. H. Khong, "SmartDJ: An interactive music player for music discovery by similarity comparison," in Proc. APSIPA Annual Summit and Conference, Kaohsiung, 2013, pp. 1-5.
[9] B. K. Baniya, C. S. Hong, and J. Lee, "Nearest multi-prototype based music mood classification," in Proc. 14th IEEE/ACIS International Conference on Computer and Information Science (ICIS), Las Vegas, NV, 2015, pp. 303-306.
[10] E. E. P. Myint and M. Pwint, "An approach for multi-label music mood classification," in Proc. 2nd International Conference on Signal Processing Systems (ICSPS), Dalian, 2010, pp. V1-290-V1-294.


SUPER MARKET ASSISTANT WITH MARKET BASKET AND INVENTORY ANALYTICS
Aditya Kiran Potdar1, Atharv Subhash Chitre2, Manisha Dhalaram Jongra3, Prasad Vijay Kudale4, Prema S. Desai5
1,2,3,4,5

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, India. [email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
The goal of the system is to help the user know whether a product he intends to buy is available within a particular shop, using a set of algorithms that provide details of the product such as price and quantity and indicate which items are frequently bought along with the product the user wants. The system would also assist the wholesaler in knowing the demand of the supermarket using inventory analysis and forecasting.

Keywords: OCR, Apriori, FP, AI, Customer, Buyer, Wholesaler

1. INTRODUCTION
In today's modern world Artificial Intelligence is booming, and it is a head start for people who are looking to develop in this field. Machine learning is central to building AI systems, and using machine learning algorithms brings easy access to such development. An application built from this combination brings growth in this area. A modern application is being made using AI and machine learning technology that addresses the modern problem of waiting in queues for shopping: it helps the customer submit his requests without standing in a queue and check whether a product is available or not.

1.1 Problem Definition
To develop an Android application based on machine-learning image recognition which helps a customer check whether a product is available in a particular shop by scanning the shop's name plate, and to proceed to buy that product depending on its availability. For the dealers who provide goods to the supermarket, scanning the necessary requirements lets the dealer learn the demand for their resources very easily.

1.2 Objective
To eradicate the day-to-day problems of buying products by having an artificially intelligent device that helps customers know whether the right product is available in a given shop. This not only indicates whether the commodities are present but also reports the price of the product.

2. MOTIVATION
The actual motivation is to help a person find the right product of his or her choice and to know the right cost estimate of the product. It helps to establish the availability of the product before entering the market, since the person has the right intelligent device. This also solves the problem of knowing that one is purchasing the right quantity of commodities for day-to-day use.

3. STATE OF ART
According to the current market scenario, there are not many applications available that search for the product a customer wants to buy while standing outside the shop. Currently no such realistic application has paved its way into


the market. Such situations tend to create problems in the buyer's head. People everywhere tend to prefer checking a big shop's stock from outside, as it saves time, so there is a gap in this field. When an informal inspection was done of how the actual operation could be carried out, many useful ideas emerged about how the user could benefit. A rough estimate suggests that people avoid stepping into shops after seeing heavy traffic outside. With all these considerations and points of view in mind, this project tries to curb the challenges that commonly occur to people. To handle these scenarios, an application is developed using machine learning algorithms. By this means a person can check whether the product he intends to buy is available in a particular shop; he can further filter by price or by the company brand he wants, entirely at the customer's choice.

4. GAP ANALYSIS
Apriori Algorithm:
- It is an array-based algorithm.
- It uses a join-and-prune technique.
- Apriori uses breadth-first search.
- Apriori utilizes a level-wise approach, generating patterns containing 1 item, then 2 items, then 3 items, and so on.
- Candidate generation is extremely slow; runtime increases exponentially with the number of distinct items.
- Candidate generation is very parallelizable.
- It requires large memory space due to the large number of candidates generated.
- It scans the database multiple times to generate candidate sets.

FP Growth Algorithm:
- It is a tree-based algorithm.




- It constructs a conditional frequent-pattern tree and conditional pattern base from the database which satisfy minimum support.
- FP-Growth uses a depth-first search.
- FP-Growth utilizes a pattern-growth approach, meaning it only considers patterns actually existing in the database.
- Runtime increases linearly, depending on the number of transactions and items.
- Data are very interdependent; each node needs the root.
- It requires less memory space due to its compact structure and no candidate generation.
- It scans the database only twice to construct the frequent-pattern tree.
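To make the support-counting idea behind both algorithms concrete, here is a deliberately naive frequent-itemset miner (pure Python, illustrative only; real Apriori prunes candidates level by level, and FP-Growth avoids candidate generation entirely):

```python
from collections import defaultdict
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Count the support of every itemset by brute force and keep those
    meeting min_support. Exponential in basket size -- exactly the cost
    that Apriori's pruning and FP-Growth's tree structure avoid."""
    counts = defaultdict(int)
    for basket in transactions:
        for r in range(1, len(basket) + 1):
            for itemset in combinations(sorted(basket), r):
                counts[itemset] += 1
    return {iset: c for iset, c in counts.items() if c >= min_support}

baskets = [{"bread", "milk"},
           {"bread", "diapers", "beer", "eggs"},
           {"milk", "diapers", "beer", "cola"},
           {"bread", "milk", "diapers", "beer"},
           {"bread", "milk", "diapers", "cola"}]
frequent = frequent_itemsets(baskets, min_support=3)
```

Here ('beer', 'diapers') comes out with support 3 -- the kind of co-purchase association the assistant would surface alongside the product the user scans.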

5. PROPOSED WORK
Proposed System:

Figure 1: System Architecture

OCR:
OCR (optical character recognition) is the recognition of printed or written text characters by a computer. This involves photo-scanning the text character by character, analyzing the scanned-in image, and then translating the character images into character codes, such as ASCII, commonly used in data processing.


FP Growth Algorithm:
The FP-Growth algorithm is an alternative way to find frequent itemsets without candidate generation, thus improving performance. For this it uses a divide-and-conquer strategy. The core of the method is a special data structure named the frequent-pattern tree (FP-tree), which retains the itemset association information.

FP-Tree structure:
The frequent-pattern tree (FP-tree) is a compact structure that stores quantitative information about frequent patterns in a database.

Algorithm 1: FP-tree construction
Input: A transaction database DB and a minimum support threshold ξ.
Output: FP-tree, the frequent-pattern tree of DB.
Method: The FP-tree is constructed as follows.
1. Scan the transaction database DB once. Collect F, the set of frequent items, and the support of each frequent item. Sort F in support-descending order as FList, the list of frequent items.
2. Create the root of an FP-tree, T, and label it "null". For each transaction Trans in DB do the following:
   - Select the frequent items in Trans and sort them according to the order of FList. Let the sorted frequent-item list in Trans be [p | P], where p is the first element and P is the remaining list. Call insert_tree([p | P], T).
   - The function insert_tree([p | P], T) is performed as follows. If T has a child N such that N.item-name = p.item-name, then increment N's count by 1; else create a new node N, with its

count initialized to 1, its parent link linked to T, and its node-link linked to the nodes with the same item-name via the node-link structure. If P is nonempty, call insert_tree(P, N) recursively.
By using this algorithm, the FP-tree is constructed in two scans of the database: the first scan collects and sorts the set of frequent items, and the second constructs the FP-tree.

FP-Growth Algorithm:

After constructing the FP-tree, it is possible to mine it to find the complete set of frequent patterns. To accomplish this, Han in [1] presents a group of lemmas and properties, and thereafter describes the FP-Growth algorithm, presented below as Algorithm 2.

Algorithm 2: FP-Growth
Input: A database DB, represented by an FP-tree constructed according to Algorithm 1, and a minimum support threshold ξ.
Output: The complete set of frequent patterns.
Method: call FP-growth(FP-tree, null).

Procedure FP-growth(Tree, a) {
  if Tree contains a single prefix path then {
    // Mining single prefix-path FP-tree
    let P be the single prefix-path part of Tree;
    let Q be the multipath part with the top branching node replaced by a null root;
    for each combination (denoted as ß) of the nodes in the path P do


      generate pattern ß ∪ a with support = minimum support of nodes in ß;
    let freq_pattern_set(P) be the set of patterns so generated;
  }
  else let Q be Tree;
  for each item ai in Q do {
    // Mining multipath FP-tree
    generate pattern ß = ai ∪ a with support = ai.support;
    construct ß's conditional pattern base and then ß's conditional FP-tree Tree_ß;
    if Tree_ß ≠ Ø then
      call FP-growth(Tree_ß, ß);
    let freq_pattern_set(Q) be the set of patterns so generated;
  }
  return (freq_pattern_set(P) ∪ freq_pattern_set(Q) ∪ (freq_pattern_set(P) × freq_pattern_set(Q)))
}

When the FP-tree contains a single prefix path, the complete set of frequent patterns can be generated in three parts: the single prefix path P, the multipath Q, and their combinations (lines 01 to 03 and 14). The resulting patterns for a single prefix path are the enumerations of its sub-paths that have the minimum support (lines 04 to 06). Thereafter, the multipath Q is defined (line 03 or 07) and the resulting patterns from it are processed (lines 08 to 13). Finally, in line 14 the combined results are returned as the frequent patterns found.

Light Gradient Boosting Machine:
- LightGBM is a gradient boosting framework that uses a tree-based learning algorithm.
- The trees in LightGBM grow vertically (leaf-wise), as compared to other algorithms whose trees grow horizontally (level-wise).


- LightGBM is used to increase the accuracy of prediction.
- LightGBM supports parallel computing and GPU learning.
- Training is faster as well as efficient.

6. CONCLUSION AND FUTURE WORK
With the help of this application we can help the customer find the exact commodity he intends to check, early in the shopping cycle; he can then check as many products as he wishes. The algorithms implemented make tasks easy for the developer, who can make updates whenever a new stage of implementation comes to mind. We have also discussed how this app could help shopkeepers improve their production. The big shopping queues created in front of stores can be reduced to an extent, which helps save the time spent standing in queues. We hope to build a payment gateway system into the application so that it will be convenient for the user to buy commodities.

REFERENCES
[1] Mindpapers Bibliography on the Philosophy of AI (compiled by David Chalmers); People with Online Papers in Philosophy of AI (compiled by David Chalmers).
[2] Philosophy and Artificial Intelligence (Association for the Advancement of Artificial Intelligence); from Russell and Norvig.
[3] J. Copeland, Artificial Intelligence: A Philosophical Introduction, Blackwell, 1993. An excellent and engaging discussion of the philosophical issues surrounding AI.
[4] The role of the Apriori algorithm for finding the association rules in data mining.
[5] J. Dongre, G. L. Prajapati, and S. V. Tokekar, "Trending topic prediction by optimizing K-nearest neighbor algorithm."
[6] https://cloud.google.com/vision/docs/ocr (S. Syarif, Anwar, Dewiani).


[7] An Implementation of FP-growth Algorithm (ACM, 2005).
[8] An Efficient Frequent Patterns Mining Algorithm Based on Apriori Algorithm and the FP-Tree Structure (IEEE, 2008).
[9] An empirical analysis and comparison of Apriori and FP-growth algorithm for frequent pattern mining (IEEE, 2015).
[10] OCR Engine to Extract Food-Items, Prices, Quantity, Units from Receipt Images, Heuristics Rules Based Approach (2017).


[11] Improving OCR performance with background image elimination (2015).
[12] Android Based Home Security Systems Using Internet of Things (IoT) and Firebase (2018).
[13] Optical Character Recognition (OCR) Performance in Server-Based Mobile Environment (2013).
[14] OCR++: A Robust Framework for Information Extraction from Scholarly Articles (2016).


ANALYSIS AND PREDICTION OF ENVIRONMENT NEAR A PUBLIC PLACE
Bhagyesh Pandey1, Rahul Bhati2, Ajay Kuchanur3, Darshan Jain4, S. P. Kosbatwar5
1,2,3,4

Student, Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, India. 5 Assistant Professor, Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, India. [email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
Background/Objectives: Weather forecasting is one of the greatest challenges for the meteorological department. Weather prediction is necessary to inform people and prepare them in advance for current and upcoming weather conditions. This helps reduce loss of human life and resources and supports the mitigation steps that are expected to be taken after a natural disaster occurs.
Methods/Statistical analysis: This study mentions various techniques and algorithms that are likely to be chosen for weather prediction and highlights the performance analysis of these algorithms. Various ensemble techniques used to boost the performance of the application are also discussed.
Findings: After a comparison between the data mining algorithms and the corresponding ensemble technique used to boost performance, a classifier is obtained that will be further used to predict the weather.
Applications: Used to predict and forecast the weather condition of a specific region based on the available historical data, which helps to save resources and prepare for changes in forthcoming years. The user can fix this system anywhere without memorizing its location; the location is continuously updated on the Android app.

Keywords: Data Mining, Decision Tree, Pre-Processing, Weather Prediction

1. INTRODUCTION
The environment monitoring system is a system capable of measuring several environmental parameters like temperature, humidity, pressure, illumination, and the quantity of gases like LPG. These parameters are important in many applications such as industry, smart homes, greenhouses, and weather forecasting. Advanced environment monitoring systems offer many features like remote access to the measurement data and can also initiate some control action from a distant location. These systems use Wireless Sensor Networks for sensing the environment parameters. A Wireless Sensor Network (WSN) has sensors to sense the physical parameters; they are interconnected wirelessly to exchange information and have a central monitoring system connected to the internet to access the data remotely. In this work, we analyze data on temperature, air quality, and sound level to avoid the spreading of diseases. A healthy person should not get affected; hence we are using the proposed IoT device, the swachh collector. This device will be installed in hospital premises, and from the sensors of the swachh collector we will get data about the present environment; that data will be analyzed through an algorithm, the important information extracted in the required format, and the extracted information mapped against the IMD database for data accuracy.

2. LITERATURE SURVEY
Embedding intelligence into the environment makes the environment interactive with other objectives; this is one of the applications that a smart environment targets. Human needs demand different types of monitoring


systems; these depend on the type of data gathered by the sensor devices. Event detection and spatial process estimation are the two categories into which applications are classified. Initially, the sensor devices are deployed in the environment to detect the parameters (e.g., temperature, humidity, pressure, LDR, noise, CO and radiation levels) while performing data acquisition, computation, and controlling action (e.g., responding to variations in the noise and CO levels with respect to the specified levels). Sensor devices are placed at different locations to collect data to predict the behavior of a particular area of interest. The main aim of this paper is to design and implement an effective monitoring system through which the required parameters are monitored remotely using the internet, the data gathered from the sensors is stored in the cloud, and the estimated trend is projected on the web browser [1]. With the progression of advancements in technology, several innovations have been made in the field of communications that are transiting to the Internet of Things. In this domain, Wireless Sensor Networks (WSN) are independent sensing devices used to monitor physical and environmental conditions, with thousands of applications in other fields. As air pollution is a major environmental problem that causes many hazardous effects on human beings, it needs to be controlled. Hence, WSN nodes were deployed for constant monitoring of air pollution around the city and on moving public transport buses and cars. This methodology gave monitoring data from the stationary nodes deployed in the city to the mobile nodes on public transport buses and cars. Data on air pollution particles such as gases, smoke, and other pollutants is collected via sensors on the public transport buses, and the data is analyzed when the buses and cars reach back to the source destination after passing through the stationary nodes around the city. The

proposed architecture, having an innovative mesh network, will be a more efficient way of gathering data from the nodes of a WSN. It will have many benefits with respect to the future concept of smart cities that will use new technologies related to the Internet of Things [2]. Temperature and relative humidity play an important role in the lifecycle of plants. When plants have the right humidity they thrive, because they open their pores completely and so breathe deeply without the threat of excessive water loss. Wireless sensor networks (WSN) have revolutionized the field of monitoring and remote sensing. A wireless sensor network, or wireless sensor and actuator network (WSAN), consists of spatially distributed sensors that monitor physical or environmental conditions such as temperature, humidity, fire, etc., and cooperatively pass their data through the network to the main location. The aim of that paper is to design and develop a system which fulfills all the above requirements. In it, a digital humidity-temperature composite (DHT11) sensor is used to sense the environmental temperature and relative humidity. An Arduino microcontroller is used to make the complex computation of the parameters and then to transmit the data wirelessly using a ZigBee S2 module to the receiver. At the receiver section, a ZigBee S2 module is used to capture the serial data transmitted by the transmitter, and using Digi's XCTU software the data is logged onto a PC [3]. Another paper uses the ZigBee CC2530 development platform, applied to various types of sensors developed for environmental monitoring systems, to enhance multi-sensor wireless signal aggregation via multi-bit decision fusion. ZigBee is a short-range wireless transmission standard, IEEE 802.15.4-based, formulated by the ZigBee Alliance. It has low cost, low power consumption, and short-distance transmission at a rate of 250 kbps for wireless sensor networks. Its main


applications include temperature, humidity and other types of data monitoring, factory automation, home automation, remote monitoring, and home device control [4]. The concern of consumers for better-quality agricultural products made farmers adapt to the latest agricultural techniques, implementing modern technologies to produce better agricultural products. Among the important things taken into consideration by farmers are the quality of agricultural land, weather conditions, etc. Traditional farming involves human labor; with proper data, the farmer will be able to deliver a quality product to the consumer. In this paper, monitoring of agriculture parameters using a soil moisture level sensor and wireless technology is discussed. The parameter results from the sensor node are transferred via a wireless transceiver to a server PC at the other end. On the PC the values are analyzed and some predicates are applied: a positive response leads to continued monitoring, while a negative one triggers a total farming solution and cultivation plan. The system also sends all these solutions to farmers or users via SMS in their regional languages [5]. The environment monitoring system, in general, is used to monitor various environmental parameters with the help of sensors. Some communication medium, like wireless communication, is needed to transfer the sensor data. An environment parameter can be temperature, pressure, humidity, GPS location, or an image. We can design a system to monitor all or any of these parameters as and when required. For monitoring purposes, we need to install some sensors on each node. A node will interact with the sensor and will transfer that data to the controlling unit. A controller will receive data from each node and can take action depending on the programming done. The user can use a Graphical User Interface (GUI) to manage

all activities or to check data at any time. A GUI can be designed using Python, HTML, CSS or any other language. Depending on the sensor types, various monitoring services can be designed, and the Internet can be used to monitor and control services or actions. Data acquired by sensors can be transferred over the network by using a web server or some SMS service; to provide energy, battery cells can be used [6]. Wireless sensor networks have been a big promise during the last few years, but a lack of real applications makes the establishment of this technology difficult. This paper reviews the wireless sensor network applications which focus mainly on environmental monitoring systems. These systems have low power consumption and low cost and are a convenient way to perform real-time monitoring. Moreover, they can also be applied to indoor living monitoring, greenhouse monitoring, climate monitoring, and forest monitoring. These approaches have proved to be an alternative to the conventional method that uses manpower to monitor the environment, and they improve performance and robustness and provide efficiency in the monitoring system. Monitoring a museum's environment for preventive conservation of art is a major concern for all museums. In order to properly conserve artwork, it is critical to continuously measure parameters such as temperature, relative humidity, light and also pollutants, in both storage and exhibition rooms. The deployment of a Wireless Sensor Network in a museum can help to implement these measurements in real time, continuously, and in a much easier and cheaper way. In this paper, we present the first testbed deployed in a Contemporary Art Museum, located in Madeira Island, Portugal, and the preliminary results of these experiments. On the other hand, we propose a new wireless sensor node that offers some advantages when compared with several commercially available


solutions. Furthermore, we present a system that automatically controls the dehumidifying devices, maintaining the humidity at more constant levels. A smart environment can be defined as sensor-enabled and networked devices that work continuously and collaboratively to make the lives of inhabitants more comfortable. In this paper, we discuss a unified signaling/sensing and communication platform known as a Wireless Sensor Network (WSN) for smart environment monitoring. WSN is one of the fastest-emerging technologies that can be utilized for monitoring our cities and living environments. The proposed paradigm can set a platform to continuously monitor the levels of large quantities of pollutants and environmental parameters on both land and sea in a more efficient, smarter way. The paper proposes a framework concerned with protecting and improving environmental quality using WSN. Among the issues the paper elaborates on are the types of sensors, sensor power systems, data communication, networking standards, and decision capabilities. In the course of the earth's evolution, there has been significant development related to the human race. However, right from the Stone Age to the mobile age, the development has been with respect to human beings only and their progress toward a comfortable life. Technology can help animals and plants through identification, monitoring, and the study of their behavior patterns. The use of technology for wildlife monitoring is a boon provided by advances in research; however, extensive use of it may prove a hindrance to animal behavior. The data gathered by wildlife monitoring can be used for a number of purposes, viz. visualization, analysis, interpretation, prediction, etc., using various algorithms and tools. The paper is designed to study the role of information technology and various tools and strategies for efficient habitat monitoring.

3. PROPOSED SYSTEM

Sensors: Sensors gather information from the surroundings and send it to the Raspberry Pi for processing and data collection.
Raspberry Pi: In the Raspberry Pi, the information gathered from the sensors is processed and stored.
Android App: The processed data from the Raspberry Pi is sent to the cloud. From the cloud, the data can be accessed by clients using the Android app.
Data from the CO sensor and the temperature and humidity sensor is collected by the Raspberry Pi, which captures the CO level, temperature, and humidity of that place. This data is stored in a MySQL database. The user can access this data through the Android app. The user can place this system at any location; GPS will update the location on the Android app.

Fig. 1: Architecture of the proposed system
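The sensor-to-cloud pipeline described above (sensors, then Raspberry Pi, then MySQL, then Android app) might be sketched as below. This is a minimal sketch, not the authors' implementation: the table name, column names, sensor read functions, and coordinates are illustrative assumptions, and a real deployment would replace the stubs with an actual sensor driver and a MySQL client such as mysql-connector-python.

```python
import time

# Stubbed sensor reads: on real hardware these would poll the
# temperature/humidity and CO sensors wired to the Pi's GPIO pins.
def read_temp_humidity():
    return 27.5, 61.0          # (temperature in degrees C, relative humidity %)

def read_co_ppm():
    return 9.2                 # carbon monoxide level in ppm

def make_insert(temp, humidity, co, lat, lon):
    """Build the parameterized INSERT the Pi would send to MySQL.

    Table and column names are hypothetical placeholders.
    """
    sql = ("INSERT INTO readings (ts, temperature, humidity, co_ppm, lat, lon) "
           "VALUES (%s, %s, %s, %s, %s, %s)")
    params = (int(time.time()), temp, humidity, co, lat, lon)
    return sql, params

temp, hum = read_temp_humidity()
sql, params = make_insert(temp, hum, read_co_ppm(), 18.4575, 73.8508)
print(sql)
print(params[1:])   # skip the timestamp, which varies per run
```

On the phone side, the Android app would simply query the same table (directly or via a small web service) to display the latest readings and the GPS-reported location.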

Classification in data mining differentiates the parameters to give a clear view of the information. We will be using the decision tree and k-means clustering algorithms in our project. Decision trees and k-means clustering appear to predict the weather with higher accuracy than other data mining techniques. The k-means clustering and decision tree building processes are implementations in which stored data about past measurements can be used for future ones.

4. CONCLUSION
The Internet of Things concept arises from the need to manage, automate, and explore all devices, instruments, and sensors in the world. In order to make wise decisions both for people and for the things in IoT,

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 298


data mining technologies are integrated with IoT technologies for decision-making support and system optimization. Data mining involves discovering novel, interesting, and potentially useful patterns from data and applying algorithms to the extraction of hidden information. In this paper, we survey data mining from three different views: the knowledge view, the technique view, and the application view. In the knowledge view, we review classification, clustering, association analysis, time series analysis, and outlier analysis. In the application view, we review typical data mining applications, including e-commerce, industry, health care, and public service. The technique view is discussed together with the knowledge view and application view. Nowadays, big data is a hot topic for data mining and IoT; we also discuss the new characteristics of big data and analyze the challenges in data extraction, data mining algorithms, and

ISSN:0975-887

data mining systems. Based on the survey of current research, a big data mining system is suggested.

REFERENCES
[1] Edward N. Lorenz, "Dynamical and Empirical Methods of Weather Forecasting," Massachusetts Institute of Technology, pp. 423-429, 2014.
[2] Mathur, S., and A. Paras, "Simple weather forecasting model using mathematical regression," Indian Res J Exten Educ: Special 1, 2012.
[3] Monika Sharma, Lini Mathew, and S. Chatterji, "Weather Forecasting using Soft Computing and Statistical Techniques," IJAREEIE, Vol. 3, Issue 7, pp. 122-131.
[4] Sohn, T., Lee, J. H., Lee, S. H., and Ryu, "Statistical prediction of heavy rain in South Korea," Advances in Atmospheric Sciences, Vol. 22, pp. 365-372, 2015.
[5] Kannan, M., Prabhakaran, S., and Ramachandran, P., "Rainfall forecasting using data mining technique," International Journal of Engineering and Technology, Vol. 2, No. 6, pp. 397-401, 2014.


SECURE CLOUD LOG FOR CYBER FORENSICS
Dr V. V. Kimbahune1, Punam Shivaji Chavan2, Priyanka Uttam Linge3, Pawan Bhutani4
1,2,3 Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India.
4 Department of ECE, HMR Institute of Technology and Management, Delhi
[email protected], [email protected], [email protected], [email protected]

ABSTRACT
The widespread use of online social networks (OSNs) to disseminate information and exchange opinions by the public, media, and political actors has opened new avenues of research in political science. In this paper, we study the problem of quantifying the political leaning of users. We formulate political leaning inference as a convex optimization problem that incorporates two ideas: users are tweeting and retweeting about political issues, and similar users tend to be retweeted by similar audiences. We then apply our inference technique to election-related tweets collected over eight months during the 2012 U.S. presidential election. On a dataset of frequently retweeted tweets, our technique achieves 90% to 94% accuracy. By studying the political leaning of 1,000 frequently retweeted sources, the 232,000 ordinary users who retweeted them, and the hashtags used by these sources, our quantitative study sheds light on the political demographics of the Twitter population and the temporal dynamics of political polarization as events unfold.
Keywords- Twitter, Tweet, Retweet, Dataset

1. INTRODUCTION
One of the largest social networks, with more than 500 million registered accounts, is Twitter. However, it differs from other large social networks, such as Facebook and Google+, because it uses exclusively arcs among accounts. Therefore, the way information propagates on Twitter is close to how information propagates in real life. Indeed, real-life communications are characterized by a high asymmetry between information producers (such as media, celebrities, etc.) and content consumers. Consequently, understanding how information propagates on Twitter has implications beyond computer science. However, studying information propagation on a large social network is a complex task. Indeed, information propagation is a combination of two phenomena. First, the content of the messages sent on the social network determines their chance of being relayed. Second, the structure of the social graph constrains the propagation of messages. In this paper, we specifically focus on how the structure of the Twitter social graph constrains the propagation of information. This problem is important because its answer will unveil the highways used by the owners of information. To achieve this goal, we need to overcome two challenges. First, we need an up-to-date and complete social graph. The most recent publicly available Twitter datasets are from 2009; at that time, Twitter was 10 times smaller than in July 2012. Moreover, these datasets are not exhaustive, so some subtle properties may not be visible. Second, we need a methodology revealing the underlying social relationships among users, a methodology that scales to hundreds of millions of accounts and tens of billions of arcs. Standard aggregate graph metrics such as degree distribution are of no help, because we need to identify the highways of the graph followed by messages. Our study has a number of implications. (a) From a modeling perspective, we see evidence that tweeting and retweeting are indeed consistent, and this observation can be applied to develop new models and algorithms. (b) From an application


perspective, besides election prediction, our method can be applied for other purposes, such as building an automated tweet aggregator that samples tweets from opposite sides of the political spectrum to provide users with a balanced view of

controversial issues in the Twittersphere. Therefore, we need a methodology to both reduce the social graph and keep its main structure.

2. LITERATURE SURVEY

Table: Literature Survey

| Sr. No | Paper Name | Dataset | Year | Technology Used |
|--------|-----------|---------|------|-----------------|
| 1 | Studying Social Networks at Scale: Macroscopic Anatomy of the Twitter Social Graph | Collected Twitter data after 2009 | 2014 | Crawling methodology (used the Twitter REST API to crawl data from Twitter) |
| 2 | Birds of the Same Feather Tweet Together: Bayesian Ideal Point Estimation Using Twitter Data | Focus on six countries where high-quality ideology measures are available for a subset of all Twitter users: the US, the UK, Spain, Germany, Italy, and the Netherlands | 2012 | Twitter API; obtained the entire list of followers |
| 3 | What's in Your Tweets? I Know Who You Supported in the UK 2010 General Election | Dataset formed of messages collected from Twitter related to the 2010 UK general election, which took place on May 6th, 2010 | 2016 | Bayesian classification |
| 4 | Political Tendency Identification in Twitter using Sentiment Analysis Techniques | Collected Twitter data after 2013 | 2014 | SVM-based algorithm |

3. PROBLEM STATEMENT
Using data mining, we collect all the data from the system and categorize it according to various fields like sports, Bollywood, politics, etc. The data collected from the system is raw; from that raw data we obtain all the information for fields such as politics. When someone posts some data, and if the posted data is vulgar, then that

post will be deleted automatically after checking against the dataset. Then, according to the counts of positive and negative comments, we obtain feedback from which a graph is generated. From the generated graph, a voter can more easily decide whom to vote for to choose the best government.

4. PROJECT SCOPE


Nowadays Twitter is a popular social medium, where everyone has an account: Bollywood figures, sportspersons, political parties. Here we focus on political accounts; similarly, we could predict match scores according to people's reviews of the pitch, movie box office collections, and so on. These reviews and graphs will help people make decisions at election time. 5. SYSTEM ARCHITECTURE

Figure: System Architecture
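The moderation and feedback-counting steps described in the problem statement can be sketched as follows. This is a minimal illustration only: the word lists and sample posts are invented placeholders, and the real system would run on the collected Twitter dataset with a proper sentiment classifier rather than keyword matching.

```python
# Placeholder word lists -- assumptions for illustration, not the
# dataset-driven lists the paper's system would use.
VULGAR = {"vulgarword"}
POSITIVE = {"good", "great", "development"}
NEGATIVE = {"bad", "corrupt", "scam"}

def moderate(posts):
    """Drop posts containing blocklisted words; count +/- feedback hits."""
    kept, pos, neg = [], 0, 0
    for text in posts:
        words = set(text.lower().split())
        if words & VULGAR:
            continue                 # "deleted automatically"
        kept.append(text)
        pos += len(words & POSITIVE)
        neg += len(words & NEGATIVE)
    return kept, pos, neg

posts = ["great development work", "corrupt deal is a scam", "vulgarword here"]
kept, pos, neg = moderate(posts)
print(len(kept), pos, neg)   # -> 2 2 2
```

The positive/negative counts returned here are what would feed the feedback graph shown to voters.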

6. PROPOSED APPROACH
To motivate our approach based on retweets, we consider a small example based on data extracted from our dataset on the presidential election. Consider a pro-Republican media source A and a pro-Democrat media source B. We observe the number of retweets they received during two consecutive events. During the "Romney 47 percent comment" event (event 6 in Table 1), source A received 791 retweets, while source B received a significantly higher 2,311 retweets. It is not difficult to imagine what happened: source B published tweets bashing the Republican candidate, and Democrat supporters enthusiastically retweeted them. Then consider the first presidential debate. It is generally viewed as an event where Romney outperformed Obama. This time source A received 3,393 retweets, while

source B received only 660 retweets. The situation reversed, with Republicans enthusiastically retweeting. This example provides two hints: (a) The number of retweets received by a tweeter (the two media sources) during an event can be a signal of its political leaning. In particular, one would expect a politically inclined tweeter to receive more retweets during an event favorable to the candidate it supports. (b) The action of retweeting carries the implicit sentiment of the retweeter. This is true even if the original tweet does not carry any sentiment itself.

7. CONCLUSION
Motivated by the election prediction problem, we study in this paper the problem of quantifying the political leaning of prominent members of the Twittersphere. By taking a new point of view on the consistency relationship between tweeting and retweeting behavior, we formulate political leaning quantification as an ill-posed linear inverse problem solved with regularization techniques. The result is an automated method that is simple and efficient and has an intuitive interpretation of the computed scores. Compared to existing manual and Twitter network-based approaches, our approach is able to operate at much faster timescales and does not require explicit knowledge of the Twitter network, which is difficult to obtain in practice.

ACKNOWLEDGMENTS
The volume of this work would not have been possible without contributions, in one form or another, by a few names to mention. We welcome this opportunity to express our heartfelt gratitude and regards to our project guide, Prof. V. V. Kimbahune, Department of Computer Engineering, STES Smt. Kashibai Navale College of Engineering, for her unconditional guidance. She always bestowed parental care upon us and evinced keen interest in solving our problems.


REFERENCES
[1] Felix Ming Fai Wong, Chee Wei Tan, Soumya Sen, and Mung Chiang, "Quantifying Political Leaning from Tweets, Retweets, and Retweeters," IEEE Transactions on Knowledge and Data Engineering.
[2] M. Gabielkov, A. Rao, and A. Legout, "Studying social networks at scale: Macroscopic anatomy of the Twitter social graph," in Proc. SIGMETRICS, 2014.
[3] P. Barberá, "Birds of the same feather tweet together: Bayesian ideal point estimation using Twitter data," Political Analysis, 2014.
[4] F. Al Zamal, W. Liu, and D. Ruths, "Homophily and latent attribute inference: Inferring latent attributes of Twitter users from neighbors," in Proc. ICWSM, 2012.
[5] P. Tavel, "Modeling and Simulation Design," AK Peters Ltd., 2007.
[6] A. Boutet, H. Kim, and E. Yoneki, "What's in your tweets? I know who you supported in the UK 2010 general election," in Proc. ICWSM, 2012.
[7] L. A. Adamic and N. Glance, "The political blogosphere and the 2004 U.S. election: Divided they blog," in Proc. LinkKDD, 2005.
[8] S. Ansolabehere, R. Lessem, and J. M. Snyder, "The orientation of newspaper endorsements in U.S. elections," Quarterly Journal of Political Science, 1(4):393-404, 2006.
[9] Rishitha Reddy, A. Sri Lakshmi, and J. Deepthi, "Quantifying Political Leaning from Tweets, Retweets, and Retweeters," International Journal of Computational Science, Mathematics and Engineering, Volume 4, Issue 2, February 2017.
[10] M. Thelwall, K. Buckley, G. Paltoglou, D. Cai, and A. Kappas, "Sentiment strength detection in short informal text," Journal of the American Society for Information Science and Technology, vol. 61, no. 12, pp. 2544-2558, 2010.


TRAFFIC FLOW PREDICTION WITH BIG DATA
Nitika Vernekar1, Shivani Naik2, Ankita More3, Dr V. V. Kimbahune4, Pawan Bhutani5
1,2,3,4 Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon(Bk), Pune, India.
5 Department of ECE, HMR Institute of Technology and Management, Delhi
[email protected], [email protected]

ABSTRACT
Traffic flow in metropolitan cities is one of the most pressing issues today. The importance of finding a solution derives from the problems currently faced by urban road traffic, such as congestion, pollution, and security issues. To address these problems, the proposed system collects raw traffic flow data for different areas of a metropolitan city and, after collection and analysis, predicts how much traffic will increase over the next few days or years and how it can be controlled. Based on defining and classifying large special events, the system analyzes the passenger flow distribution characteristics of such events and studies the spatial and temporal distribution of road traffic flow surrounding the event areas. The system designs a common process of traffic organization and management for different large special events, proposes static and dynamic traffic organization methods and management strategies, and designs the operation steps, which provide a reference and guidance for the traffic organization practice of large special events.
Keywords- Intelligent transportation, Traffic, Prediction

1. INTRODUCTION
Accurate and timely traffic flow information is currently strongly needed by individual travelers, business sectors, and government agencies. Traffic flow in a metropolitan city is higher than in other metro and urban cities, so traffic flow is one of the most pressing issues today. The importance of finding a solution derives from the problems currently faced by urban road traffic, such as congestion, pollution, and security issues. To analyze and solve these problems, the proposed system first collects raw traffic flow data for different areas of the metropolitan city, then analyzes the traffic data to find congested areas. The system can also predict how much traffic will increase over the next few days or years and how to control it. Users can also avoid going to a particular area at the time of large special events.

2. LITERATURE SURVEY
Joe Lemieux et al. [1] note that global optimization of the energy consumption of dual-power-source vehicles, such as hybrid electric vehicles, plug-in hybrid electric vehicles, and fuel-cell electric vehicles, requires knowledge of the complete route characteristics at the beginning of the trip. One of the primary characteristics is the vehicle speed profile over the route, which translates directly into energy requirements for a given vehicle. However, the vehicle speed that a given driver chooses varies from driver to driver and from time to time, and may be slower than, equal to, or faster than the average traffic flow. If the specific driver's speed profile can be predicted, energy consumption can be optimized over the chosen route. The purpose of their paper is to research the use of deep learning techniques to identify, at the beginning of a drive cycle, the driver-specific vehicle speed profile for an individual driver on a


repeated drive cycle, which can be used in an optimization algorithm to minimize the amount of fossil-fuel energy used during the trip.
Youness Riouali et al. [2] state that traffic flow modelling is a fundamental step in planning and controlling transportation systems. It is essential not only for improving safety and transportation efficiency, but can also yield economic and environmental benefits. Considering the discrete and continuous aspects of traffic flow dynamics, hybrid Petri nets have proved to be a powerful tool for approaching these dynamics and describing vehicle behavior precisely, since they integrate both aspects. A new extension of hybrid Petri nets is presented in their paper for generalizing traffic flow modelling by considering state conditions on external rules, which can be scheduled with nondeterministic time, such as stop signs or priority roads. In addition, a segmentation of roads is proposed to deal with the precise localization of events.
Leyre Azpilicueta et al. [3] note that intelligent transportation systems (ITSs) are currently under intense research and development for making transportation safer and more efficient. The development of such vehicular communication systems requires accurate models of the propagation channel. A key characteristic of these channels is their temporal variability and inherent time-varying statistics, which have a major effect on electromagnetic propagation prediction. Their article investigates the channel properties of a wireless communication system in a vehicular communication environment with deterministic modelling. An analysis of the physical radio channel propagation of an ultra-high-frequency (UHF) radio-frequency identification (RFID) system for a vehicle-to-infrastructure (V2I) scattering environment is presented. A new module was implemented in the proposed site-specific tool that considers the movement of the vehicles, leading to space-time-

frequency models. The strong dependence on the environment due to multipath propagation is demonstrated. These results can help in identifying the optimal location of the transceivers to limit power consumption and increase service performance, improving vehicular communications in ITS.
DAI Lei-lei et al. [4], after defining and classifying large special events, investigate the passenger flow distribution characteristics of large special events and study the spatial and temporal distribution of road traffic flow surrounding the event areas. By summarizing the traffic organization and management experiences of models at home and abroad, combined with the classification results, the paper designs a common process of traffic organization and management for different large special events, proposes static and dynamic traffic organization methods and management strategies, and plans the operation steps, which provide a reference and guidance for the traffic organization practice of large special events.
Thomas Liebig et al. [5] state that situation-dependent route planning attracts increasing interest as cities become crowded and jammed. They present a system for individual trip planning that incorporates future traffic hazards in routing. Future traffic conditions are computed by a spatio-temporal random field based on a stream of sensor readings. In addition, their approach estimates traffic flow in areas with low sensor coverage using Gaussian process regression. The conditioning of spatial regression on intermediate predictions of a discrete probabilistic graphical model allows the incorporation of historical data, streamed online data, and a rich dependency structure at the same time. They demonstrate the system with a real use case from Dublin city, Ireland.
Shen, L. et al. [6] focus on examining dynamic platoon

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 305

Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

dispersion models that can capture the variability of traffic flow in a cross-sectional traffic detection environment. The dynamic models are applied to predict the evolution of traffic flow, and are further used to generate signal timing plans that account not only for the current state of the system but also for the expected short-term changes in traffic flows. They investigate factors influencing model accuracy, including time-zone length, position of upstream traffic detection equipment, road section length, traffic volume, turning rates, and computation time. The effect of these factors on the model's performance is illustrated through a simulation study, and the computational performance of the models is discussed. The results show that both the dynamic speed-truncated normal distribution model and the dynamic Robertson model with dynamics outperform their respective static versions, and that they can be further applied for dynamic control.
Graf, R. et al. [7] propose that future driving assistance systems will require an increased ability to handle complex driving situations and to react appropriately according to situation criticality and requirements for risk minimization. Humans driving on motorways can judge, for example, cut-in situations of vehicles because of their experience. The idea presented in their paper is to adapt these human capabilities to technical systems and to learn different situations over time. Case-based reasoning is applied to predict the behavior of road participants, since it incorporates a learning aspect based on knowledge acquired from the driving history. This concept facilitates recognition by matching real driving situations against stored ones. In the first instance, the concept is evaluated on activity prediction of vehicles in neighbouring lanes on motorways and focuses on


vehicles cutting into the lane of the host vehicle.
Shen, L. et al. [8] state that the lack of adequate temporal-variation analysis and spatial correlation measurements leads to limited completion accuracy, and poses a significant challenge for an ITS. Using the low-rank nature and the spatial-temporal correlation of traffic network data, their paper proposes a novel approach to reconstruct missing traffic data based on low-rank matrix factorization, which elaborates the potential implications of the traffic network through decomposed factor matrices. To further exploit the temporal-evolution characteristics and the spatial similarity of road links, they design a time-series constraint and an adaptive Laplacian-regularization spatial constraint to investigate the local connection between road links. Experimental results on six real-world traffic data sets demonstrate that their approach outperforms the other techniques and can effectively reconstruct road traffic data accurately under various common loss modes.

3. EXISTING SYSTEM APPROACH
In the existing framework, traffic jams in a metropolitan city are more frequent than in other urban and rural areas, so traffic jams are one of the most common issues today. A traffic jam occurs when the movement of vehicles is hampered at a particular place for some reason over a certain period of time. If the number of vehicles plying on a street or road exceeds the maximum capacity it is built to sustain, it results in traffic jams. A traffic jam, or traffic congestion, is an everyday affair in big cities. It is the result of a growing population and the increased use of personal, public, and commercial transport vehicles. The loss of productive time caused by traffic jams is not at all good for


a nation's economic development. Moreover, it results in more wastage of fuel by stationary vehicles, contributing further to environmental pollution. There is also an increased possibility of road accidents, as vehicles have to stand or move in close proximity to one another, and also because of aggressive driving by frustrated drivers. Overall, the time wasted in roads turned into parking lots also leads to economic loss for the country. There is therefore a need for traffic prediction targeted at special events as well as specific places. 4. PROPOSED SYSTEM APPROACH

Fig.1 Block Diagram of Proposed System
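The prediction step described in the abstract, estimating how much traffic will increase over coming days or years from historical counts, could be sketched as a least-squares linear trend fit. This is a hedged sketch under the assumption of a simple linear growth model; the daily vehicle counts below are invented for illustration, and the real system would read counts for each area from the uploaded traffic dataset.

```python
def fit_trend(counts):
    """Ordinary least-squares fit of count = intercept + slope * day."""
    n = len(counts)
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# One week of invented daily vehicle counts for a single area.
daily_counts = [1200, 1260, 1310, 1385, 1440, 1490, 1555]
slope, intercept = fit_trend(daily_counts)
forecast_day_30 = intercept + slope * 29   # extrapolate to day 30
print(round(slope, 1))                     # -> 59.1 (vehicles/day of growth)
```

A production system would likely use a richer model (seasonality, special-event effects), but the same fit-then-extrapolate structure applies.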

In the proposed system, raw traffic flow data for different areas of a metropolitan city is collected. After collection and analysis, the system predicts how much traffic will increase over the next few days or years and how it can be controlled. The system designs a common process of traffic organization and management for different large special events, proposes static and dynamic traffic organization methods and management strategies, and designs the operation steps, which provide a reference and guidance for the traffic organization practice of large special events. The proposed system consists mainly of two modules: admin and user. The admin plays the most important role in the traffic prediction system, performing functions such as uploading the traffic dataset, uploading the route dataset, and viewing user and traffic information. The user can search traffic under different scenarios, such as search by location or search by

season. Based on defining and classifying large special events, the system analyzes the passenger flow distribution characteristics of such events, studies the spatial and temporal distribution of road traffic flow surrounding the event areas, and also finds the traffic of particular areas. The system can recommend different routes to the user.

5. CONCLUSION
Traffic jams in metropolitan cities are among the most discussed topics today. Different kinds of people face problems of urban road traffic and road accidents, such as congestion, pollution, and security problems. For these reasons, road traffic increases day by day. To solve the existing problems, a new system is proposed that gathers the system requirements and collects raw data on traffic jams at different places in a metropolitan city. After collection and analysis, it predicts how much traffic will increase over the next few days or years and how to control it. Based on defining and classifying large special events, this framework examines the passenger flow distribution characteristics of such events, studies the spatial and temporal distribution of road traffic flow surrounding the event areas, and gives direction to the traffic organization practice for large special events.

6. ACKNOWLEDGMENT
This work is aimed at a traffic prediction system for any state in India. The authors are thankful to the Faculty of Engineering and Technology (FET), Savitribai Phule Pune University, Pune, for providing the facility to carry out the research work.

REFERENCES
[1] Joe Lemieux and Yuan Ma, "Vehicle Speed Prediction using Deep Learning," Department of Electrical and Computer


Engineering, University of Michigan-Dearborn, MI, USA, 2015.
[2] Youness Riouali, Laila Benhlima, and Slimane Bah, "Petri net extension for traffic road modelling," Mohammadia School of Engineers, Mohammed V University of Rabat, AMIPS, Rabat, Morocco, 2016.
[3] Leyre Azpilicueta, César Vargas-Rosales, and Francisco Falcone, "Intelligent vehicle communication: Deterministic propagation prediction in transportation systems," IEEE Vehicular Technology Magazine, 2016.
[4] DAI Lei-lei, GU Jin-gang, SUN Zheng-liang, and QIU Hong-tong, "Study on Traffic Organization and Management Strategies for Large Special Events," International Conference on System Science and Engineering, Dalian, China, 2012.
[5] Thomas Liebig, Nico Piatkowski, Christian Bockermann, and Katharina Morik, "Route Planning with Real-Time Traffic Predictions," TU Dortmund University, Dortmund, Germany, 2014.
[6] Shen, L., Liu, R., Yao, Z., Wu, W., and Yang, H., "Development of Dynamic Platoon Dispersion Models for Predictive Traffic Signal Control," IEEE Transactions on Intelligent Transportation Systems, pp. 1-10, 2018. doi:10.1109/tits.2018.2815182
[7] Graf, R., Deusch, H., Fritzsche, M., and Dietmayer, K., "A learning concept for behavior prediction in traffic situations," 2013 IEEE Intelligent Vehicles Symposium (IV), 2013. doi:10.1109/ivs.2013.6629544
[8] Shen, L., Liu, R., Yao, Z., Wu, W., and Yang, H., "Development of Dynamic Platoon Dispersion Models for Predictive Traffic Signal Control," IEEE Transactions on Intelligent Transportation Systems, pp. 1-10, 2018. doi:10.1109/tits.2018.2815182


DETERMINING DISEASES USING ADVANCE DECISION TREE IN DATA MINING TECHNOLOGY Vrushali Punde1, Priyanka Pandit2, Sharwari Nemane3 1,2,3

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, India. [email protected], [email protected], [email protected]

ABSTRACT
Heart disease is the leading cause of death amongst all diseases, and the number of people suffering from it rises each year. This calls for early diagnosis and treatment. Due to a lack of resources in the medical field, prediction of heart disease may not always be possible. This paper addresses the prediction of heart disease from input attributes using data mining techniques. Data mining is a method of exploring massive datasets to find hidden patterns and discover knowledge. The large volume of data available from medical diagnosis is analyzed using an advanced decision tree algorithm. With this, hospitals could offer better diagnosis and treatment to the patient and attain a good quality of service.
Keywords
Decision Tree, Machine Learning, QA System, heart disease prediction.

1. INTRODUCTION
The main reason for death worldwide, including in South Africa, is heart disease, and detection at an earlier stage can prevent these attacks. Medical practitioners generate data with a wealth of concealed information that is not used effectively for predictions. For this reason, this research converts the unused data into a dataset for modeling using different data mining techniques. People die having encountered symptoms that were not taken into consideration. There is a requirement for medical practitioners to detect heart disease before it occurs in their patients. The features that increase the chances of heart attack are smoking, lack of physical exercise, high blood pressure, high cholesterol, unhealthy diet, harmful use of alcohol, and high sugar levels. Cardiovascular disease (CVD) comprises coronary heart disease, cerebro-vascular disease (stroke), hypertensive heart disease, congenital heart disease, peripheral artery disease, rheumatic heart disease, and inflammatory heart disease. Data mining is a knowledge discovery technique that examines data and encapsulates it into useful information. The current research intends to forecast the probability of getting heart disease given a patient data set. Prediction and description are the principal goals of data mining: prediction uses attributes or variables in the data set to locate unknown or future values of other attributes, while description emphasizes discovering patterns that describe the data in a way humans can interpret.

2. MOTIVATION
With the huge data growth in the biomedical and healthcare domains, accurate analysis of medical data benefits early detection, patient care, and community services. The analysis accuracy is reduced if the quality of the medical data is incomplete.

3. LITERATURE SURVEY
A literature survey is the most important step in any kind of research. Before starting development, we need to study previous papers in the domain in which we are working; on the basis of that study we can identify the drawbacks and start working with reference to previous papers. In this section, we briefly review related work on heart disease prediction and its different techniques.


[1] Classification of Heart Diseases using K Nearest Neighbor and Genetic Algorithm (2013)
K nearest neighbor (KNN) is a very simple, popular, highly efficient and effective technique for pattern recognition. KNN is a straightforward classifier, in which samples are classified based on the class of their nearest neighbors. Medical databases are large in nature, and if the data set contains excessive and irrelevant attributes, classification may give less accurate results. Heart disease is the main cause of death in India; in Andhra Pradesh it was the prime cause of mortality, accounting for 32% of all deaths, a rate as high as Canada (35%) and the USA. Hence there is a need for a decision support system that helps clinicians take precautionary steps. This work proposed a new technique that combines KNN with a genetic algorithm for effective classification. Genetic algorithms perform a global search in complex, large and multimodal landscapes and provide near-optimal solutions.

[2] A Survey of Non-Local Means based Filters for Image De-noising (2013)
Image de-noising manipulates image data to produce a visually high-quality image. The Non-Local Means filter was originally designed for Gaussian noise removal, and the filter has been adapted for speckle noise reduction. Speckle noise is the primary source of noise in medical ultrasound imaging and should be filtered out. This work reviews the existing Non-Local Means based filters for image de-noising.

[3] Improved Study of Heart Disease Prediction System Using Data Mining Classification Techniques (2012)
This work analyzed prediction systems for heart disease using a larger number of input attributes. It uses 13 medical attributes such as sex, blood pressure and cholesterol to predict the likelihood of a patient getting heart disease, and adds two more attributes, obesity and smoking. The data mining classification algorithms Decision Trees, Naive Bayes, and Neural Networks are analyzed on the heart disease database.

[4] Cardio Vascular Disease Prediction System using Genetic Algorithm (2012)
Medical diagnosis systems play an important role in medical practice and are used by practitioners for diagnosis and treatment. In this work, a medical diagnosis system is defined for predicting the risk of cardiovascular disease. The system combines the relative advantages of a genetic algorithm and a neural network. Multilayered feed-forward neural networks are particularly suited to complex classification problems, and the weights of the network are determined using a genetic algorithm because it finds an acceptably good set of weights in a small number of iterations.

[5] Wavelet Based QRS Complex Detection of ECG Signal (2012)
A wide range of heart conditions can be identified by thorough examination of the features of the ECG report, and automatic extraction of time-plane features is valuable for identifying vital cardiac diseases. This work presents a multiresolution wavelet-transform based system for detecting the 'P', 'Q', 'R', 'S' and 'T' peaks of the original ECG signal. The 'R-R' interval is an important feature of the ECG signal that corresponds to the heartbeat of the person. An abrupt increase in the height of the 'R' wave or changes in the 'R-R' interval denote various anomalies of the human heart; similarly, the 'P-P', 'Q-Q', 'S-S' and 'T-T' intervals correspond to other anomalies, and their peak amplitudes indicate further cardiac diseases. In the proposed method the 'PQRST' peaks are marked and stored over the entire signal, and the time intervals between consecutive 'R' peaks and the other peaks are measured to find anomalies in the behavior of the heart, if any.

[6] Heart Disease Diagnosis using Data Mining Technique - Decision Tree
It has


tremendous efficiency using fourteen attributes, after applying a genetic algorithm to reduce the actual data size and obtain the optimal subset of attributes acceptable for heart disease prediction.

[7] Predictions in Heart Disease using Techniques of Data Mining
Different classification techniques of data mining have their merits and demerits for data classification and knowledge extraction.

[8] Disease Prediction by Machine Learning over Big Data from Healthcare Communities
This paper proposes a new convolutional neural network based multimodal disease risk prediction algorithm using structured and unstructured data from hospitals.
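The KNN idea surveyed in [1] — assigning a sample the majority class of its k nearest training samples — can be sketched in a few lines. This is a minimal illustration with made-up patient vectors, not the authors' implementation (their work additionally pairs KNN with a genetic algorithm for attribute selection):

```python
from collections import Counter
from math import dist  # Euclidean distance (Python 3.8+)

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbors = sorted(zip(train_points, train_labels),
                       key=lambda pl: dist(pl[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy, made-up records: (age, cholesterol); label 1 = heart disease present.
train = [(63, 233), (37, 250), (41, 204), (56, 236), (57, 354), (44, 169)]
labels = [1, 1, 0, 0, 1, 0]
print(knn_predict(train, labels, query=(58, 300), k=3))  # → 1
```

Because raw medical attributes have very different scales, a real system would normalize features first; excessive or irrelevant attributes degrade accuracy, which is exactly why [1] adds a genetic algorithm for feature selection.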

4. GAP ANALYSIS

Sr no | Author | Title | Publisher | Conclusion | Limitations
1 | M. Akhil Jabbar, B. L. Deekshatulu, Priti Chandra | Classification of Heart Diseases using K Nearest Neighbor and Genetic Algorithm | CIMTA | Uses KNN and a genetic algorithm for heart disease detection. | 1. Low accuracy. 2. Limited dataset used.
2 | Beshiba Wilson, Dr. Julia Punitha Malar Dhas | A Survey of Non-Local Means based Filters for Image De-noising | IJERT | Reviews the existing Non-Local Means based filters for image de-noising. | 1. Addresses image de-noising only. 2. Works only on images, not on text.
3 | Chaitrali S. Dangare, Sulabha S. Apte | Improved Study of Heart Disease Prediction System Using Data Mining Classification Techniques | International Journal of Computer Applications | Analyzes heart disease prediction systems using a larger number of input attributes (e.g. sex, blood pressure, cholesterol) to predict the likelihood of heart disease. | 1. Limited dataset used.
4 | M. Akhil Jabbar, Dr. Priti Chandra, Dr. B. L. Deekshatulu | Cardio Vascular Disease Prediction System using Genetic Algorithm | ICECIT | Builds the system by combining the relative advantages of a genetic algorithm and a neural network. | 1. Low accuracy.
5 | K. V. L. Narayana, A. Bhujanga Rao | Wavelet Based QRS Complex Detection of ECG Signal | IISTE | Identifies a wide range of heart conditions by thorough examination of the features of the ECG report. | 1. Data not extracted properly. 2. Accuracy is low.

5. PROPOSED SYSTEM
This work is used for finding heart diseases; based on risk factors, heart disease can be predicted easily. The main aim of this paper is to predict the

heart diagnosis. First, the heart disease numeric dataset is extracted and pre-processed. Then the features that meet the required conditions are extracted and classified by a Decision Tree (DT). Compared to existing


algorithms, the proposed approach provides better performance. After classification, performance criteria including accuracy, precision and F-measure are calculated. The comparison reveals that the decision tree is the best classifier for the diagnosis of heart disease on the existing data. Rules are easily generated.
Advantages:
 Predicts heart disease for structured data using a machine learning algorithm, i.e., Decision Tree (DT).
 Finds reliable answers using this system.
 Achieves better accuracy.
 Easy to understand and interpret.
 Implicitly performs feature selection.
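The evaluation criteria named above follow directly from confusion-matrix counts. A minimal sketch, with hypothetical counts chosen only to reproduce a 0.9 score:

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F-measure from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_measure

# Hypothetical test split: 9 true positives, 1 false positive,
# 1 false negative, 9 true negatives.
acc, prec, rec, f1 = classification_metrics(tp=9, fp=1, fn=1, tn=9)
assert abs(prec - 0.9) < 1e-9 and abs(f1 - 0.9) < 1e-9
```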

Fig. System Architecture

Table: Accuracy comparison for the heart dataset

Metric | Existing System | Proposed System
Precision | 0.825 | 0.9
Recall | 0.825 | 0.9
F-Measure | 0.825 | 0.9
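The claim that rules are easily generated from a decision tree can be illustrated by writing a trained tree as nested if/else tests. The thresholds below are purely illustrative, not learned from the paper's dataset:

```python
def predict_heart_disease(age, cholesterol, max_heart_rate):
    """A small decision tree rendered as rules; thresholds are hypothetical."""
    if cholesterol > 240:                       # root split on cholesterol
        return 1 if max_heart_rate < 150 else 0
    return 1 if (age > 55 and cholesterol > 200) else 0

assert predict_heart_disease(60, 260, 140) == 1  # high-risk profile
assert predict_heart_disease(40, 180, 170) == 0  # low-risk profile
```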

6. CONCLUSION AND FUTURE WORK
By using the decision tree algorithm on specific attributes, a classification model is generated, and with this model we predict heart disease. This work can be enhanced by increasing the number of attributes used for disease prediction, making the system more accurate.

REFERENCES
[1] Sarath Babu, Vivek EM, Famina KP, "Heart Disease Diagnosis using Data Mining Technique", International Conference on Electronics Communication and Aerospace Technology (ICECA), 2017.
[2] Beshiba Wilson, Dr. Julia Punitha Malar Dhas, "A Survey of Non-Local Means based Filters for Image Denoising", International Journal of Engineering Research & Technology, Vol. 2, Issue 10, October 2013.


[3] Chaitrali S. Dangare, "Improved Study Of Heart Disease Prediction System Using Data Mining Classification Techniques", International Journal of Computer Applications, Vol. 47, No. 10, June 2012.
[4] Amma, N.G.B., "Cardio Vascular Disease Prediction System using Genetic Algorithm", IEEE International Conference on Computing, Communication and Applications, 2012.
[5] Sayantan Mukhopadhyay, Shouvik Biswas, Anamitra Bardhan Roy, Nilanjan Dey, "Wavelet Based QRS Complex Detection of ECG Signal", International Journal of Engineering Research and Applications (IJERA), Vol. 2, Issue 3, May-Jun 2012, pp. 2361-2365.
[6] M. Akhil Jabbar, B. L. Deekshatulu, Priti Chandra, "Classification of Heart Disease Using K-Nearest Neighbor and Genetic Algorithm", International Conference on Computational Intelligence: Modeling Techniques and Applications (CIMTA), 2013.
[7] Monika Gandhi, Dr. Shailendra Narayan Singh, "Predictions in Heart Disease using Techniques of Data Mining", International Conference on Futuristic Trends in Computational Analysis and Knowledge Management (ABLAZE), 2015.
[8] Min Chen, Yixue Hao, Kai Hwang, "Disease Prediction by Machine Learning over Big Data from Healthcare Communities", IEEE, 2016.
[9] Jyoti Soni, "Predictive Data Mining for Medical Diagnosis: An Overview of Heart Disease Prediction", International Journal of Computer Applications (0975-8887), Vol. 17, No. 8, March 2011.
[10] P. K. Anooj, "Clinical decision support system: Risk level prediction of heart disease using weighted fuzzy rules", Journal of King Saud University - Computer and Information Sciences, 24 (2012), pp. 27-40.
[11] Nidhi Bhatla, Kiran Jyoti, "An Analysis of Heart Disease Prediction using Different Data Mining Techniques", International Journal of Engineering Research & Technology (IJERT), Vol. 1, Issue 8, October 2012, ISSN: 2278-0181.
[12] Aditya Methaila, Prince Kansal, Himanshu Arya, Pankaj Kumar, "Early Heart Disease Prediction Using Data Mining Techniques", Sundarapandian et al. (Eds): CCSEIT, DMDB, ICBB, MoWiN, AIAP 2014, pp. 53-59, 2014, doi:10.5121/csit.2014.4807.


SURVEY PAPER ON MULTIMEDIA RETRIEVAL USING SEMANTIC CROSS MEDIA HASHING METHOD Prof.B.D.Thorat1, Akash Parulekar2, Mandar Bedage3, Ankit Patil4 ,Dipali Gome5

1,2,3,4,5

Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, India. [email protected], [email protected], [email protected] [email protected], [email protected]

ABSTRACT
Storage requirements for visual and textual data have increased in recent years, following the appearance of many interactive multimedia services and applications for mobile devices in personal and business scenarios. Hashing methods are useful for a variety of tasks and have attracted great attention in recent times. Different approaches have been proposed to capture the similarities between text and images; however, most existing work uses the bag-of-words method to represent text information. Since words with different forms may have similar meanings, semantic text similarity cannot be captured well by these methods. To address these challenges, this paper presents a method called semantic cross-media hashing (SCMH), which uses continuous word representations to capture semantic textual similarity and a deep belief network (DBN) to build the correlation between different modalities. To demonstrate the effectiveness of the proposed method, three commonly used benchmark data sets are considered. Experimental results show that the proposed method achieves significantly better results, and its efficiency is comparable or superior to other hashing methods.
Keywords
Semantic cross media hashing method, SIFT descriptor, word embedding, ranking, mapping.

1. INTRODUCTION
With the fast development of the internet and multimedia, information in various forms has become simple and easy to access, modify and duplicate. Information in different forms may have semantic correlations: for example, a microblog post often carries tags, and a video on YouTube is always associated with a related description or tags as semantic information. Data that inherently consists of different modalities creates a great emerging demand for applications like cross-media retrieval, image annotation and recommendation systems. Therefore, hashing methods that calculate or approximate similarity search have been suggested and have received remarkable attention in the last few years. The core problem of hash learning is how to formulate the underlying correlation between multiple modalities and retain the similarity relation within each respective modality. Generally, hashing methods divide into two categories: matrix decomposition based methods and vector based methods. Matrix decomposition based hashing methods search low-dimensional spaces to reconstruct data and quantize the reconstruction coefficients to obtain binary codes. Such methods avoid graph construction and eigen-decomposition, but they cause large quantization errors, which deteriorate performance for large code lengths. We have designed a multi-modal hashing model, SCMH, which focuses on image and text data with binary-representation hashing. The method processes text data using the skip-gram model and image data using the SIFT descriptor.


After that, it generates hash codes using a deep neural network, avoiding duplicates.
Motivation
 Existing approaches use Canonical Correlation Analysis (CCA), manifold learning, dual-wing harmoniums, deep autoencoders, and deep Boltzmann machines for this task.
 Due to the efficiency of hashing-based methods, there is also a rich line of work on mapping multi-modal high-dimensional data to low-dimensional hash codes, such as Latent Semantic Sparse Hashing (LSSH), Discriminative Coupled Dictionary Hashing (DCDH), Cross-view Hashing (CVH), and so on.
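The advantage of continuous word representations over bag-of-words — that differently spelled words with similar meanings still compare as similar — can be sketched with toy vectors standing in for trained skip-gram embeddings. The 3-d vectors below are invented for illustration:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    num = sum(a * b for a, b in zip(u, v))
    return num / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def text_vector(words, embeddings):
    """Average the word vectors of a text (words missing from the vocabulary are skipped)."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    dim = len(next(iter(embeddings.values())))
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

# Toy 3-d vectors standing in for trained skip-gram embeddings.
emb = {"car": [0.9, 0.1, 0.0], "automobile": [0.85, 0.2, 0.05],
       "banana": [0.0, 0.1, 0.95]}
sim_close = cosine(text_vector(["car"], emb), text_vector(["automobile"], emb))
sim_far = cosine(text_vector(["car"], emb), text_vector(["banana"], emb))
assert sim_close > sim_far  # synonyms score higher than unrelated words
```

Under a bag-of-words representation, "car" and "automobile" would share no term and score zero similarity; with continuous vectors they score close to 1.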

2. RELATED WORK
A literature survey is the most important step in any kind of research. Before starting development, we need to study previous papers in our domain; on the basis of that study we can identify the drawbacks and start working with reference to previous papers. In this section, we briefly review related work on tag search and image search and their different techniques.
Y. Gong, S. Lazebnik, A. Gordo, and F. Perronnin: This paper addresses the problem of learning similarity-preserving binary codes for efficient similarity search in large-scale image collections. The problem is formulated as finding a rotation of zero-centered data that minimizes the quantization error of mapping the data to the vertices of a zero-centered binary hypercube, and a simple and efficient alternating minimization algorithm is proposed to accomplish this task [1].
Y. Pan, T. Yao, T. Mei, H. Li, C.-W. Ngo, and Y. Rui: This paper demonstrates that two fundamental challenges can be mitigated by jointly exploring cross-view learning and the use of click-through data. The former aims to create a latent subspace able to compare information from originally incomparable views (i.e., textual and visual), while the latter exploits the largely available and freely accessible click-through data (i.e., "crowdsourced" human intelligence) for understanding queries [2].

Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune.

Page 315

Proceedings of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019

coupled dictionary for each modality is learned with side information (e.g., categories). As a result, the coupled dictionaries not only preserve the intrasimilarity and inter-correlation among multi-modal data, but also contain dictionary atoms that are semantically discriminative (i.e., the data from the same category is reconstructed by the similar dictionary atoms) [7]. H. Zhang, J. Yuan, X. Gao, and Z. Chen: In this paper, we propose a new cross-media retrieval method based on short-term and long-term relevance feedback. Our method mainly focuses on two typical types of media data, i.e. image and audio. First, we build multimodal representation via statistical canonical correlation between image and audio feature matrices, and define cross-media distance metric for similarity measure; then we propose optimization strategy based on relevance feedback, which fuses short-term learning results and long-term accumulated knowledge into the objective function [8]. A. Karpathy and L. Fei-Fei: We present a model that generates natural language descriptions of images and their regions. Our approach leverages datasets of images and their sentence descriptions to learn about the inter-modal correspondences between language and visual data. Our alignment model is based on a novel combination of Convolution Neural Networks over image regions, bidirectional Recurrent Neural Networks over sentences, and a structured objective that aligns the two modalities through a multimodal embedding [9]. J. Song, Y. Yang, Y. Yang, Z. Huang, and H. T. Shen: In this paper, we present a new multimedia retrieval paradigm to innovate large-scale search of heterogeneous multimedia data. It is able to return results of different media types from heterogeneous data sources, e.g., using a query image to retrieve relevant


text documents or images from different data sources [10]. 3. EXISTING SYSTEM Lot of work has been done in this field because of its extensive usage and applications. In this section, some of the approaches which have been implemented to achieve the same purpose are mentioned. These works are majorly differentiated by the algorithm for multimedia retrieval. In another research, the training set images were divide into blobs. Each such blob has a keyword associated with it. For any input test image, first it is divided into blobs and then the probability of a label describing a blob is found out using the information that was used to annotate the blobs in the training set. As my point of view when I studied the papers the issues are related to tag base search and image search. The challenge is to rank the top viewed images and making the diversity of that images is main task and the search has that diversity problem so the open issue is diversity. 4. PROPOSED SYSTEM We propose a novel hashing method, called semantic cross-media hashing (SCMH), to perform the near-duplicate detection and cross media retrieval task. We propose to use a set of word embeddings to represent textual information. Fisher kernel framework is incorporated to represent both textual and visual information with fixed length vectors. For mapping the Fisher vectors of different modalities, a deep belief network is proposed to perform the task. We evaluate the proposed method SCMH on two commonly used data sets. SCMH achieves better results than state-of-the-art methods with different the lengths of hash codes and also display query results in ranked order. Advantages:


 We introduce a novel DBN-based method to construct the correlation between different modalities.
 The proposed method can significantly outperform state-of-the-art methods.
 It improves searching accuracy.
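SCMH itself learns the cross-modal mapping with a DBN, which is beyond a short sketch; the generic idea of binary hashing plus Hamming-distance retrieval, however, can be illustrated with random-hyperplane (LSH-style) codes. The vectors below are invented stand-ins for mapped Fisher vectors:

```python
import random

def make_hasher(dim, n_bits, seed=0):
    """n_bits random hyperplanes; each bit is the sign of one projection (LSH-style)."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
    def hash_code(vec):
        return tuple(int(sum(p * x for p, x in zip(plane, vec)) > 0)
                     for plane in planes)
    return hash_code

def hamming(a, b):
    """Number of differing bits between two codes."""
    return sum(x != y for x, y in zip(a, b))

hasher = make_hasher(dim=4, n_bits=16)
img = [0.9, 0.1, 0.3, 0.7]           # stand-in for an image Fisher vector
txt_match = [0.88, 0.12, 0.28, 0.7]  # stand-in for a matching text vector
txt_other = [-0.5, 0.9, -0.2, 0.1]   # stand-in for an unrelated text vector
assert hamming(hasher(img), hasher(txt_match)) <= hamming(hasher(img), hasher(txt_other))
```

Retrieval then reduces to ranking database items by Hamming distance to the query's code, which is far cheaper than comparing real-valued vectors directly.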

System Architecture:

Fig. System Architecture

5. CONCLUSION
In this paper we propose SCMH, a novel hashing method for near-duplicate detection and cross-media retrieval. We propose to use word embeddings to represent textual information, and the Fisher kernel framework to represent both textual and visual information with fixed-length vectors. To map the Fisher vectors of different modalities, a deep belief network performs the task. We evaluate the proposed method on the MIRFlickr dataset, where SCMH achieves the best results over other hashing methods on both the text-to-image and image-to-text tasks. Experimental results demonstrate the effectiveness of the proposed method for cross-media retrieval.

REFERENCES
[1] Y. Gong, S. Lazebnik, A. Gordo, and F. Perronnin, "Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval," IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 12, pp. 2916–2929, Dec. 2013.
[2] Y. Pan, T. Yao, T. Mei, H. Li, C.-W. Ngo, and Y. Rui, "Clickthrough-based cross-view learning for image search," in Proc. 37th Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, 2014, pp. 717–726.
[3] D. Zhai, H. Chang, Y. Zhen, X. Liu, X. Chen, and W. Gao, "Parametric local multimodal hashing for cross-view similarity search," in Proc. 23rd Int. Joint Conf. Artif. Intell., 2013, pp. 2754–2760.
[4] G. Ding, Y. Guo, and J. Zhou, "Collective matrix factorization hashing for multimodal data," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., 2014, pp. 2083–2090.
[5] H. Jégou, F. Perronnin, M. Douze, J. Sánchez, P. Pérez, and C. Schmid, "Aggregating local image descriptors into compact codes," IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 9, pp. 1704–1716, Sep. 2011.
[6] J. Zhou, G. Ding, and Y. Guo, "Latent semantic sparse hashing for cross-modal similarity search," in Proc. 37th Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, 2014, pp. 415–424.
[7] Z. Yu, F. Wu, Y. Yang, Q. Tian, J. Luo, and Y. Zhuang, "Discriminative coupled dictionary hashing for fast cross-media retrieval," in Proc. 37th Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, 2014, pp. 395–404.
[8] H. Zhang, J. Yuan, X. Gao, and Z. Chen, "Boosting cross-media retrieval via visual-auditory feature analysis and relevance feedback," in Proc. ACM Int. Conf. Multimedia, 2014, pp. 953–956.
[9] A. Karpathy and L. Fei-Fei, "Deep visual-semantic alignments for generating image descriptions," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., Boston, MA, USA, Jun. 2015, pp. 3128–3137.
[10] J. Song, Y. Yang, Y. Yang, Z. Huang, and H. T. Shen, "Inter-media hashing for large-scale retrieval from heterogeneous data sources," in Proc. Int. Conf. Manage. Data, 2013, pp. 785–796.


MODERN LOGISTICS VEHICLE SYSTEM USING TRACKING AND SECURITY Arpit Sharma1 , Bakul Rangari2 , Rohit Walvekar3 , Bhagyashree Nivangune4 , Prof .G.Gunjal5 1,2,3,4,5

Department of Computer Engineering, Smt. Kashibai Navale College Of Engineering , Vadgaon(Bk),Pune, India.

ABSTRACT
Logistics management systems have risen recently with the development of the Global Positioning System (GPS), mobile communication technologies, and sensor and wireless networking technologies. They are vital because they bring several advantages, for example suggesting the right places for picking up customers, increasing the revenue of truck drivers, reducing waiting time and traffic jams, and minimizing fuel consumption, hence increasing the number of trips the drivers can perform. The main purpose of this framework is to supply the vehicles required to meet customer demands through the planning, control and implementation of the effective movement and storage of related information and services from origin to destination. We aim to provide end-to-end security for customer and provider data by using the QR code concept, to recommend the nearest best service provider according to customer interest, and to detect spam service providers. Logistics management refers to the responsibility to design and administer systems that control the movement and geographical positioning of raw materials, work-in-process, and finished inventories at the lowest total cost. It includes the administration of order planning, inventory, transportation, and the combination of warehousing, materials handling, and packaging, all integrated throughout a network of facilities.
General Terms
Intelligent transportation, logistic system, QR code, request allocation, vehicle routing.

1. INTRODUCTION
To solve the problems of conventional movers-and-packers systems, an electronic solution has been suggested that allows both customers and service providers to track vehicles during transportation, and furthermore provides the best services to customers at minimal expense by recommending only available service providers at the preferred cost. In logistics, systems concentrating on public transportation services have been studied broadly. For the most part, these logistics management systems can be divided into two categories: the first models vehicles according to dynamic requests, and the second models vehicles according to notable trajectories in the mobility patterns of customers using GPS.

2. MOTIVATION
Transportation logistics systems have emerged recently with the development of the Global Positioning System (GPS), mobile communication technologies and wireless networking technologies. These are very important as they can contribute several benefits, such as suggesting the right places for picking up customers, increasing revenue for drivers, and reducing waiting time, hence increasing the number of trips the drivers can perform. The main purpose of this system is to supply transportation vehicles


that are used to meet customer demands through the planning, control and implementation of the effective movement and storage of related information and services from origin to destination, and also to maintain user information in the form of a QR code. The proposed system focuses on the delivery of goods and raw materials and on shifting home appliances and furniture during relocation.

3. LITERATURE SURVEY
1. An Automated Taxi Booking and Scheduling System
This work presents an automated taxi booking and scheduling system with secure booking. The system provides a convenient, secure and safe booking process for both taxi drivers and registered customers through smartphones. When many customers arrive at the same time, however, problems occur, as there are no taxi ranks, central offices, or a booking structure for the large number of taxis.
2. Autonomous vehicle logistic system: Joint routing and charging strategy
The main aim of this work is to make the inevitable changes more tangible. It starts from the general consensus that the industry is changing and goes further to indicate and measure the extent of that change. Within a more complex and diversified mobility industry landscape, incumbent players will be forced to compete on multiple fronts simultaneously and to cooperate with competitors. City type will replace country or region as the most relevant segmentation dimension that determines mobility behavior.
3. Integration of vehicle routing and resource allocation in a dynamic logistics network
This work presents a multi-period, integrated vehicle routing and resource allocation problem. Ignoring the interdependencies between vehicle routing and resource allocation appears

to be inferior, and a combination of the two problems overcomes this shortcoming. The two sub-problems can be solved sequentially (SP), by means of hierarchical decision making (FI), or by model update (DI); the latter two approaches are derived from Geoffrion's concept of model integration. A stochastic programming treatment of the transportation problem is not addressed.
4. Product allocation to different types of distribution center in retail logistics networks
In this work, a novel solution approach is developed and applied to a real-life case of a leading European grocery retail chain. A further limitation arises from assuming identical store delivery frequencies in outbound transportation from all DC types.
5. The dynamic vehicle allocation problem with application in trucking companies in Brazil
This paper deals with the dynamic vehicle allocation problem (DVAP) in road transportation of full truckloads between terminals. The DVAP involves multi-period resource allocation and consists of defining the movements of a fleet of vehicles that transport goods between terminals with a wide geographical distribution. The results of a practical validation of the proposed model and solution methods are not clearly specified.
6. Road-based goods transportation: A survey of real-world logistics applications from 2000 to 2015
This paper gives a review of the main real-world applications of road-based goods transportation over the past 15 years. It surveys papers in the areas of oil, gas and fuel transportation, retail, waste collection and management, mail and parcel delivery, and food distribution, and addresses the integration of routing problems with other parts of the supply chain.


Another promising area of research, the integration of vehicle routing with other transportation modes such as ships and trains, is not addressed.
7. Online to Offline Business: Urban Taxi Dispatching with Passenger-Driver Matching Stability
A stable-marriage approach is proposed. It can deal with unequal numbers of passenger requests and taxis by matching them to dummy partners. For shared taxi dispatches (multiple passenger requests can share a taxi), passenger requests are packed by solving a maximum set packing problem.
8. Noah: A Dynamic Ridesharing System
The system analyzer shows the system performance, including average waiting time, average reroute rate, average response time, and average degree of sharing. The system cannot, however, allow users to request taxis from their current location.
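The stable-marriage dispatching idea surveyed in item 7 can be sketched with the classic Gale-Shapley algorithm. The passenger and taxi names and the preference lists below are purely illustrative assumptions, not data from the cited paper; the paper's handling of unequal sides via dummy partners is omitted for brevity.

```python
# Illustrative Gale-Shapley sketch of stable passenger-taxi matching.
# All names and preference lists are hypothetical.

def stable_match(passenger_prefs, taxi_prefs):
    """Return a stable passenger -> taxi assignment (passenger-proposing)."""
    free = list(passenger_prefs)                  # passengers still unmatched
    next_choice = {p: 0 for p in passenger_prefs} # next taxi each will propose to
    engaged = {}                                  # taxi -> passenger
    rank = {t: {p: i for i, p in enumerate(prefs)}
            for t, prefs in taxi_prefs.items()}
    while free:
        p = free.pop(0)
        t = passenger_prefs[p][next_choice[p]]    # best taxi not yet proposed to
        next_choice[p] += 1
        if t not in engaged:
            engaged[t] = p
        elif rank[t][p] < rank[t][engaged[t]]:    # taxi prefers the new passenger
            free.append(engaged[t])
            engaged[t] = p
        else:
            free.append(p)                        # rejected; tries next taxi
    return {p: t for t, p in engaged.items()}

passenger_prefs = {"P1": ["T1", "T2"], "P2": ["T1", "T2"]}
taxi_prefs = {"T1": ["P2", "P1"], "T2": ["P1", "P2"]}
print(stable_match(passenger_prefs, taxi_prefs))  # -> {'P2': 'T1', 'P1': 'T2'}
```

In the resulting assignment no passenger-taxi pair would both prefer each other over their current partners, which is the stability property the dispatching work relies on.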

4. GAP ANALYSIS
Table 2. Comparison of the existing and proposed systems

Existing system:
- The admin provides authentication permission to providers, and only the admin can view vehicles, customers and providers.
- Providers can add vehicles and drivers, view customer requests and send notifications to drivers.
- Customers can view vehicles, search vehicles, request vehicles and pay according to the trip.

Proposed system:
- The admin provides authentication permission to providers and can view vehicles, customers and providers, detect spam service providers, and rank service providers.
- Customers can view and search vehicles, request vehicles, track vehicles on a map, pay service providers, post reviews on the system, and view or send information in the form of a QR code.
- Providers can add vehicles and drivers, view customer requests and send notifications to drivers.
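The QR-code exchange of customer and provider information mentioned above can be sketched as follows: before the record is rendered as a QR image, the payload is serialized and signed so that only a party holding the shared key can verify it. The key, field names and HMAC scheme here are illustrative assumptions, not the paper's actual design; rendering the resulting string as a QR image would be done with any QR library.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"demo-shared-key"  # hypothetical key shared with authorized viewers

def make_qr_payload(record: dict) -> str:
    """Serialize and sign a record; the returned string is what a QR image would encode."""
    body = json.dumps(record, sort_keys=True).encode()
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(body).decode() + "." + tag

def read_qr_payload(payload: str) -> dict:
    """Verify the signature and recover the record; raise ValueError if tampered."""
    encoded, tag = payload.rsplit(".", 1)
    body = base64.urlsafe_b64decode(encoded)
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("payload failed verification")
    return json.loads(body)

payload = make_qr_payload({"customer": "C101", "provider": "P7", "trip": "Pune-Mumbai"})
print(read_qr_payload(payload)["trip"])  # -> Pune-Mumbai
```

Signing only authenticates the data; if the QR content must also be unreadable to outsiders, the payload would additionally need encryption before encoding.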

5. FIGURES/CAPTIONS
[Figure omitted: diagram depicting the actual working of the proposed system and all the functionalities it will perform.]

6. PROPOSED WORK
In the existing logistic management process, customers need to search for providers and for the required vehicles to complete a transportation task. This leads to increased waiting time for


the customer, and the customer is unable to trace the current location of the transported material. The primary concern of our framework is to provide end-to-end security for customer and provider data using the QR code concept: customer and provider data are hidden in the QR code's binary image, and only authorized users can view the data. For customer interest mining we use the collaborative filtering method. The basic principle of this method is the recommendation of vehicles according to the providers' service. Recommendation is used to discover

customer interest and to suggest related options. Customer advice here refers to interest mining: one can give guidance for a problem or simply give an answer, guidance being essentially an opinion accompanied by a request or instruction. Following the recommendations, the interest profiles and reviews of existing customers about a provider are used to help a new customer choose that provider's vehicles. We provide end-to-end security for customer and provider data using the QR code concept.
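The collaborative-filtering step described above can be sketched as user-based filtering over provider ratings. The customer names, ratings and the choice of cosine similarity below are illustrative assumptions, since the paper does not specify them.

```python
import math

# Hypothetical customer -> {provider: rating} matrix; in the real system this
# would come from the reviews customers post after a trip.
ratings = {
    "alice": {"FastMovers": 5, "CityCargo": 2},
    "bob":   {"FastMovers": 4, "SafeShift": 5},
    "carol": {"CityCargo": 1, "SafeShift": 4},
}

def cosine(u, v):
    """Cosine similarity over the providers both customers rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[p] * v[p] for p in common)
    return dot / (math.sqrt(sum(x * x for x in u.values())) *
                  math.sqrt(sum(x * x for x in v.values())))

def recommend(target, ratings):
    """Score providers the target has not used, weighted by neighbour similarity."""
    scores = {}
    for other, theirs in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], theirs)
        for provider, r in theirs.items():
            if provider not in ratings[target]:
                scores[provider] = scores.get(provider, 0.0) + sim * r
    return max(scores, key=scores.get) if scores else None

print(recommend("alice", ratings))  # -> SafeShift
```

A production system would also filter by availability and distance before ranking, and could flag providers whose ratings diverge sharply from the consensus as potential spam.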

7. ACKNOWLEDGMENTS
It gives us great pleasure to present the preliminary project report on a modern logistics vehicle system using tracking and security. We would like to take this opportunity to thank our internal guide, Prof. G. Gunjal, for all the help and guidance we needed; we are really grateful for the kind support, and the valuable suggestions were very helpful. We are also grateful to Dr. P. N. Mahalle, Head of the Computer Engineering Department.

REFERENCES


[1] Albara Awajan, "An Automated Taxi Booking and Scheduling System," Conference on Automation Engineering, 12 January 2015.
[2] A. Holzapfel, H. Kuhn, and M. G. Sternbeck, "Product allocation to different types of distribution center in retail logistics networks," European Journal of Operational Research, February 2016.
[3] J. Q. Yu and A. Y. S. Lam, "Autonomous vehicle logistic system: Joint routing and charging strategy," IEEE Transactions on Intelligent Systems, 2016.
[4] R. A. Vasco and R. Morabito, "The dynamic vehicle allocation problem with application in trucking companies in Brazil," Computers and Operations Research, 24 April 2016.
[5] L. C. Coelho, J. Renaud, and G. Laporte, "Road-based goods transportation: A survey of real-world logistics applications from 2000 to


2015," Information Systems and Operational Research, March 2016.
[6] T. Huth and D. C. Mattfeld, "Integration of vehicle routing and resource allocation in a dynamic logistics network," Transportation Research Part, 15 July 2015.
[7] Huanyang Zheng and Jie Wu, "Online to Offline Business: Urban Taxi Dispatching with


Passenger-Driver Matching Stability," IEEE 37th International Conference on Distributed Computing Systems, 2017.
[8] Cheng Qiao, Mingming Lu, Yong Zhang, and Kenneth N. Brown, "An Efficient Dispatch and Decision-making Model for Taxi-booking Service," 21 July 2016.


NETWORK AND CYBER SECURITY


ONLINE VOTING SYSTEM USING OTP
Archit Bidkar1, Madhabi Ghosh2, Prajakta Madane3, Rohan Mahapatra4, Prof. Jyoti Nandimath5
1,2,3,4,5 Department of Computer Engineering, Smt. Kashibai Navale College of Engineering, Vadgaon, Pune, Maharashtra, India.

ABSTRACT
Currently, voting throughout the world is done using Electronic Voting Machines (EVMs). Though this system is widely followed, it has many drawbacks: people have to travel to their assigned polling booth stations, wait in long queues to cast their vote, and face unnecessary problems. It becomes difficult for working professionals or elderly and sick people to cast their vote under this system. This calls for a change, which can be achieved if the voting process is conducted online. A few developed countries are trying to implement online voting on a small scale and have been successful in doing so. We propose a system which overcomes the limitations of existing online systems that use biometric technologies, and instead uses a One Time Password scheme, which is more secure and accurate.

Key Terms
Electronic Voting Machines (EVM), Online Voting System (OVS), One Time Password (OTP), Election Commission (EC)

1. INTRODUCTION
The online voting system will be a website. Online voting is a technique in which people who are Indian citizens, are above 18 years of age and are of any sex can cast their vote without going to any physical polling station. The online voting system is a software application through which voters can cast votes after filling in the forms distributed in their respective ward offices. All the information in the forms is entered by a data entry officer and stored in a database. Each voter has to correctly enter all basic information, such as name, sex, religion, nationality, criminal record, etc., in the form taken from the ward office.

2. EXISTING SYSTEM
The current system that most countries, including India, follow is voting using Electronic Voting Machines. Before EVMs were introduced and legalized for the voting procedure, the paper ballot system was used. The first use of Internet/online voting for a political election took place in the US in 2000, with more countries subsequently beginning to conduct trials of and/or use Internet voting. A total of 14 countries have by now used online voting for political elections or referenda. Within the group of Internet voting system users, four core countries have been using Internet voting over the course of several elections: Canada, Estonia, France and Switzerland. Estonia is the only country to offer Internet voting to the entire electorate. The remaining ten countries have either just adopted it, are currently piloting Internet voting, have piloted it and not pursued its further use, or have discontinued its use.

3. MOTIVATION
The average turnout over all nine phases of the 2014 Lok Sabha election was around 66.38%. Owing to the current government's Digital India campaign, 88% of households in India have a mobile phone, and many people have mobile phones and internet connections even in rural areas. With the expansion of communication networks throughout India, casting votes online is a feasible idea. India's mobile phone subscriber base crested the 1 billion users mark, as per data released recently by the country's telecom


regulator. People of all age groups must be able to willingly exercise their right to vote without feeling any sort of dissatisfaction. Currently 42% of internet users in India have an average internet connection speed above 4 Mbit/s, 19% have a speed of over 10 Mbit/s, and 10% enjoy speeds over 15 Mbit/s; the average internet connection speed on mobile networks in India is 4.9 Mbit/s. With so many people connected to the internet, the idea of using an OVS is very much feasible, and it also overcomes various other problems faced during the election process, such as creating awareness in rural areas and among youths, cost reduction, security, etc.

4. SOME IMPORTANT POINTS FROM A REVIEW OF OVS [6]
1) Trust in Internet Voting
- Trust in the electoral process is essential for a successful democracy. However, trust is a complex concept, which requires that individuals make rational decisions based on the facts to accept the integrity of Internet voting.
- Technical institutions and experts can play an important role in this process, with voters trusting the procedural role played by independent institutions and experts in ensuring the overall integrity of the system.
- One of the fundamental ways to enable trust is to ensure that information about the Internet voting system is made publicly available.
- A vital aspect of integrity is ensured through testing, certification and audit mechanisms. These mechanisms need to demonstrate that the security concerns presented by Internet voting have been adequately dealt with.
2) The Secrecy and Freedom of the Vote
- Ensuring the secrecy of the ballot is a significant concern in every voting situation. In the case of Internet voting from unsupervised environments, this

principle may easily become the main challenge.
- Given that an Internet voting system cannot ensure that voters are casting their ballots alone, the validity of Internet voting must be demonstrated on other grounds.
3) Accessibility of Internet Voting
- Improving accessibility to the voting process is often cited as a reason for introducing Internet voting. The accessibility of online voting systems, closely linked to usability, is relevant not only for voters with disabilities and linguistic minorities, but also for the average voter.
- The way in which voters are identified and authenticated can have a significant impact on the usability of the system, but a balance needs to be found between accessibility and integrity.
- Different groups in society have different levels of access to the Internet. Therefore, the provision of online voting in societies where there is very unequal access to the Internet will have a different impact on accessibility for various communities.
4) Electoral Stakeholders and Their Roles
- The introduction of Internet voting significantly changes the roles that stakeholders play in the electoral process. Not only do new stakeholders, such as voting technology suppliers, assume prominence in the Internet voting process, but existing stakeholders must adapt their roles in order to fulfill their existing functions.
- Central to this new network of stakeholder relationships is public administration, especially the role of the EC. Public administration and the EC will establish the legal


and regulatory framework for the implementation of online voting, and this framework will define the roles and rights of the various stakeholders in the Internet voting process.
- Internet voting introduces several new elements and points of inquiry for election observers. These include evaluating the security of voting servers, assessing the EC's monitoring of voting-server security and threat-response plans, and the functioning of Internet Service Providers (ISPs).

5. LITERATURE SURVEY
Some of the reference papers we used for our project are described below.
In [1]:
- The authors propose an approach for an effective, user-friendly application for all users. The system is being developed for use by everyone, with a simple and self-explanatory graphical user interface (GUI). The GUI at the server's end enables creating the polls on behalf of the client.
- The authors further experiment with a televoting process, i.e., voting by sending an SMS from the user's registered mobile number.
- They also propose multi-language support so that a user can access and interact with the website in the language he/she is comfortable with.
- By using this proposed system, the percentage of absentee voting will decrease.
In [2]:
- The authors propose to build an e-voting system, which is basically an online voting system through which people can cast their vote


through their smartphones or by using an e-voting website.
- The authentication technique proposed is the One Time Password (OTP). The OTP principle produces a pseudorandom password each time the user tries to log on, and this OTP is sent to the voter's mobile phone.
- An OTP is a password that is only valid for a single login session, thus improving security. The system takes care that no voter can determine for whom anyone else voted and that no voter can duplicate anyone else's vote.
- This technique is imposed to ensure that only a valid person is allowed to vote in the elections.
In [3]:
- Electronic voting systems provide improved features over the traditional voting system, such as accuracy, convenience and flexibility. The design of the system guarantees that no votes in favor of a given candidate are lost due to improper tallying of the voting counts.
- The authors propose to make full use of a person's Aadhaar card, developed under the UIDAI project, to make the election process foolproof.
- Their system has:
1) User Mode - the user fills in data according to his/her Aadhaar card. The system then verifies it and allows the user complete access to the website.
2) Admin Mode - in this mode, officers of the EC are appointed to keep watch on the proceedings of the election and have the authority to start and stop the election and to procure the result too.
- GAP ANALYSIS: this will help us get a clear idea about the various ideas proposed.


6. PROPOSED SYSTEM
With India on a fast track toward achieving the status of "Digital India", there have been improvements in internet and mobile communication infrastructure, and many people are now aware of the various advantages of the internet. With so much progress in this field, why not implement the use of the internet in India's voting process? With this project we are trying to help every Indian citizen who is above 18 years of age to vote for his/her favored candidate without any fear of being pressured by political party members or of breaking any commitments. Instead, one can vote from home, the office or an institute at any time before the deadline for that particular day's election. For this, we do not need any elaborate infrastructure or expensive personal digital assistants. The user will fill out the registration form, which is available at every ward office, and also

submit a copy of his/her Aadhaar card as extra proof. During registration, the user must correctly fill in all personal information, such as name, mobile number, ward number, etc. Once the user submits the completely filled form, the data operator enters all the data into the Election Commission database. Once the user record is created, a username and password are sent to the registered user's mobile number. On receiving these, the user can access the voting website, and on entering the received credentials he/she is prompted to change the password. For security reasons, the username is by default set to the user's Aadhaar card number and cannot be changed. After successfully creating a new password and logging in, the user can view his/her profile to check whether there are any discrepancies. As an extra step of security, we propose to make use of a One Time Password (OTP) for user login. A one-time password is a password that is valid


for only one login session or transaction on a computer system or other digital device. OTPs avoid a number of shortcomings associated with traditional password-based authentication; a number of implementations also incorporate two-factor authentication by ensuring that the one-time password requires access to something a person has (such as a smartcard or a specific cellphone) as well as something a person knows (such as a PIN). This ensures that an individual can vote only for himself/herself, thus reducing fraudulent votes. Only when the user enters the correct Aadhaar card number, mobile number and set password does the website give the option to generate an OTP. On clicking it, an OTP is sent to the user's mobile number within 2 minutes. On entering the correct OTP, the user is able to log in and cast a vote. Once the user selects the candidate he/she wants to vote for, the system pops up a confirmation message. Once the user confirms the vote, he/she is automatically logged out from the website, thus preventing the user from voting again.
Additionally, the website has another option for admin login. The admins are officers selected by the Election Commission who monitor the voting as it progresses and have their profiles created by the Election Commission. Their main tasks are to start and stop the election on time, make sure it progresses without any issues, and generate local ward results once elections are finished and send them to the Election Commission.
On the information front, the website has details of all candidates selected by the respective parties for different wards. On selecting any candidate's name, complete information about that candidate is displayed. Various awareness programs that the Election Commission is conducting are also displayed. This helps voters gain more knowledge of the voting process.
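The single-use OTP step described above can be sketched with Python's standard library. The six-digit format, the two-minute validity window and the in-memory store are illustrative assumptions standing in for the real delivery (SMS) and storage layers, and the voter identifier is a placeholder.

```python
import secrets
import time

OTP_VALIDITY_SECONDS = 120  # assumed validity window for this sketch
_pending = {}               # voter id -> (otp, issue time); stand-in for a real store

def generate_otp(voter_id: str) -> str:
    """Issue a fresh 6-digit OTP; a real system would SMS it to the registered number."""
    otp = f"{secrets.randbelow(10**6):06d}"
    _pending[voter_id] = (otp, time.time())
    return otp

def verify_otp(voter_id: str, submitted: str) -> bool:
    """Accept the OTP once, and only inside the validity window."""
    entry = _pending.pop(voter_id, None)  # single-use: removed on first attempt
    if entry is None:
        return False
    otp, issued = entry
    return submitted == otp and time.time() - issued <= OTP_VALIDITY_SECONDS

code = generate_otp("voter-demo")
assert verify_otp("voter-demo", code)      # first, timely use succeeds
assert not verify_otp("voter-demo", code)  # replay of the same OTP is rejected
```

Using `secrets` rather than `random` matters here: OTPs must be unpredictable, and popping the entry on the first attempt is what makes the password genuinely one-time.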


The results, which the admins send to the ECI, will be further analyzed. The results will be broken down into the result of each state, the overall winner of the election, and by how much a certain party has beaten the other competitors.

ALGORITHM FOR THE PROPOSED OVS
Algorithm: Successful online voting
Input: Biodata of voters and candidates, details of the various wards.
Output: Successful voting for voters and declaration of results.
Steps:
1. The person must be 18 years of age or above.
2. Fill in Form 6 for first-time registration at the respective ward office.
3. For changes in details, contact the respective ward office.
4. The necessary documents must be submitted while doing steps 2 and 3; failing to do so will result in rejection of the form.
5. Once the forms and documents are verified, the data entry operator enters the person's details in the database and a default password is sent to the user.
6. On receiving the password, the user must log in with it and select a new password to access the website for further use.
7. Once the new password is set, the user can view the profile and election-related information.
8. If any discrepancies are found in the profile, step 3 must be followed.
9. To cast a vote, the user must enter an OTP, which is sent to the registered mobile number and is active for 1 minute.
10. If the OTP is not received, repeat step 9.
11. Once the user enters the correct OTP, the vote can be cast.
12. On successful voting, a confirmation message is displayed and the user is logged out.


13. The final result is declared after the election and everyone can view it.

7. CONCLUSION
With this project we are trying to allow the maximum number of people to vote. People can save time by avoiding standing in queues and can vote for their choice of candidate; the elderly and the sick can also cast their votes without making any trip to polling stations; and there will be an overall increase in voter turnout.

REFERENCES
[1] Pallavi Divya, Piyush Aggarwal, Sanjay Ojha (School of Management, Centre for Development of Advanced Computing (CDAC), Noida), "Advanced Online Voting System," International Journal of Scientific Research Engineering & Technology (IJSRET), Volume 2, Issue 10, pp. 687-691, January 2014, www.ijsret.org, ISSN 2278-0882.
[2] Uttam Patil (Asst. Prof., Dr. MSSCET, Computer Science branch), Vaibhav More, Mahesh Patil (8th Sem, Dr. MSSCET, Computer Science branch), "Online Election Voting Using One Time Password," National Conference on Product Design (NCPD 2016), July 2016.
[3] C. Tamizhvanan, S. Chandramohan, A. Mohamed Navfar, P. Pravin Kumar, R. Vinoth (Department of Electronics and Communication Engineering, Achariya College of Engineering Technology, Puducherry, India), "Electronic Voting System Using Aadhaar Card," International Journal of Engineering Science and Computing, March 2018.
[4] Chetan Sontakke, Swapnil Payghan, Shivkumar Raut, Shubham Deshmukh, Mayuresh Chande, D. J. Manowar (Department of Computer Science and Engineering, KGIET, Darapur, Maharashtra, India), "Online Voting System via Mobile," International Journal of Engineering Science and Computing, May 2017.
[5] R. Sownderya, J. Vidhya, V. Viveka, M. Yuvarani, R. Prabhakar (Department of ECE, Vivekanandha College of Engineering for Women, India), Asian Journal of Applied Science and Technology (AJAST), Volume 1, Issue 2, pp. 6-10, March 2017.
[6] https://www.ndi.org/e-voting-guide/internet-voting - reviews of online voting conducted by various countries.


ACCIDENT DETECTION AND PREVENTION USING SMARTPHONE
Sakshi Kottawar1, Mayuri Sarode2, Ajit Andhale3, Ashay Pajgade4, Shailesh Patil5
1,2,3,4 Student, Smt. Kashibai Navale College of Engineering, Pune
5 Assistant Professor, Smt. Kashibai Navale College of Engineering, Pune
[email protected], [email protected], [email protected], [email protected], [email protected]
ABSTRACT
The number of accidents has continued to expand at an exponential rate. Due to the dynamic nature of VANET positions, there can be considerable delay in the transmission of messages to destination points. Android phones are broadly used due to features like GPS, computational capability and internet connectivity. Traffic blocking and road accidents are the foremost problems in many areas. Moreover, delay in realizing the accident's position, together with traffic congestion between the accident site and the hospital, increases the chances of losing the victim. To solve this problem, we develop an Android application which detects accidents automatically. It makes use of the various sensors within the Android phone, such as the accelerometer, gyroscope and magnetometer. A low-power-consumption protocol is used for the effective transmission of messages and notifications to third-party vehicles.
Keywords: Accelerometer, Gyroscope, GPS, Rollover, Deceleration, Accident Detection.

1. INTRODUCTION
The demand for emergency road services has risen around the world. Moreover, changes in the role of emergency crews have occurred, from essentially transporting injured persons to the hospital to delivering basic treatment or even advanced life support to patients before they arrive at the hospital. In addition, advances in science and technology are changing the way emergency rescue operates. In times of road emergency, appropriately skilled staff and ambulances should be dispatched to the scene without delay. Efficient roadside emergency services demand accurate information about the patient (adult, child, etc.), their condition (bleeding, conscious or unconscious, etc.), and their clinical needs. In order to improve the chances of survival for passengers involved in car accidents, it is desirable to reduce the response time of rescue teams and to optimize the medical and rescue resources needed. A faster and more efficient rescue will increase the chances of survival and recovery for injured victims. Thus, once an accident has occurred, it is crucial to manage the emergency rescue resources efficiently and quickly.
With the rapid development of society, there are some side effects, including an increasing number of car accidents. On average, one out of every three motor vehicle accidents results in some type of injury. Traffic accidents are one of the leading causes of fatalities in most countries. As the number of vehicles increases, accidents also increase. The government has taken a number of actions and has conducted many awareness programs, yet accidents increase as the population increases.
The proposed system can detect accidents automatically using accelerometer sensors and notify all the nearest application users and emergency points (police station, hospital).


2. MOTIVATION
The motivation for this work was primarily an interest in undertaking a challenging project in an interesting area of research. With the rapid development of society, there are some side effects, including an increasing number of car accidents. On average, one out of every three motor vehicle accidents results in some type of injury. Traffic accidents are one of the leading causes of fatalities in most countries. As the number of vehicles increases, accidents also increase. The government has taken a number of actions and has conducted many awareness programs, yet accidents increase as the population increases. There is a need to design a system that helps accident victims, since half of the fatalities are due to the lack of quick medical aid. Many systems that make use of an on-board accident unit exist, but there are no efficient systems that can detect accidents through smartphones.

3. LITERATURE SURVEY
[1] Attila Bonyar, Oliver Krammer et al.: The paper gives an overview of the existing eCall solutions for car accident detection, in which sensors are utilized for crash sensing and notification. eCall is an emergency call that can be generated either manually by a passenger or automatically via the activation of in-vehicle sensors when a serious accident is detected. When activated, the in-vehicle eCall system establishes a 112 voice connection directly to the nearest safety answering point. Even if the passenger is not able to speak, a minimum set of data (MSD) is sent to the safety point, including the location of the crash site, the triggering mode, the vehicle identification number, a timestamp, and the current location. This information is valuable for emergency responders in reaching the victims as soon as possible.

ISSN:0975-887

[2] Girts Strazdins, Artis Mednis, Georgijs Kanonirs et al.: The paper examines one of the most popular smartphone platforms at the moment, whose popularity is still rising. Additionally, it is one of the most open and flexible platforms, providing software developers easy access to phone hardware and a rich software API. The authors envision Android-based smartphones as a powerful and widely used participatory sensing platform in the near future. The paper examines Android smartphones in the context of road surface quality monitoring. The authors evaluated a set of pothole detection algorithms on Android phones running a sensing application while driving a car in an urban environment. The results provide insight into hardware differences between various smartphone models, along with suggestions for further investigation and optimization of the algorithms, sensor choices, and signal processing.
[3] Jorge Zaldivar, Carlos T. Calafate et al.: The paper combines smartphones with existing vehicles through an appropriate interface, moving closer to the smart vehicle paradigm and offering the user new functionality and services when driving. The authors propose an Android-based application that monitors the vehicle through an On-Board Diagnostics (OBD-II) interface and is able to detect accidents. The proposed application estimates the G force experienced by the passengers in case of a frontal collision, which is used together with airbag triggers to detect accidents. The application reacts to a positive detection by sending details about the accident through either e-mail or SMS to predefined destinations, immediately followed by an automatic phone call to the emergency services. Experimental results using a real vehicle show that the application is able to react to accident events in less than 3 seconds, a very low time, validating the feasibility of smartphone-based solutions for improving safety on the road.


[4] Joaquim Ferreira, Arnaldo Oliveira et al.: The paper explains how wireless vehicular networks for cooperative Intelligent Transport Systems (ITS) have raised widespread interest in the last few years due to their potential applications and services. Cooperative applications with data sensing, acquisition, processing and communication provide an unprecedented potential to improve vehicle and road safety, passenger comfort, and the efficiency of traffic management and road monitoring. Safety, efficiency and comfort ITS applications exhibit tight latency and throughput requirements; for example, safety-critical services require a guaranteed maximum latency lower than 100 ms, while most infotainment applications require QoS support and data rates higher than 1 Mbit/s. The mobile units of a vehicular network are the equivalent of nodes in a traditional wireless network and can act as the source, destination or router of information. Communication between mobile nodes can be point-to-point, point-to-multipoint or broadcast, depending on the requirements of each application. Besides the ad-hoc implementation of a network consisting of neighboring vehicles joining up and establishing Vehicle-to-Vehicle (V2V) communication, there is also the possibility of a more traditional wireless network setup, with base stations along the roads in Vehicle-to-Infrastructure (V2I) communication that work as access points, manage the flow of information, and serve as portals to external WANs.
[5] Cheng Bo, Xuesi Jian et al.: The paper addresses the critical task of dynamically detecting the simultaneous behavior of driving and texting using the smartphone as the sensor. The authors propose, design and implement TEXIVE, which achieves the goal of detecting texting operations during driving by utilizing irregularities and rich micro-movements of the user. Without relying on any infrastructure or additional devices, and with no need to bring any modification to

vehicles, TEXIVE is able to successfully detect dangerous operations with good sensitivity, specificity and accuracy by leveraging the inertial sensors integrated in regular smartphones.
[6] Brian Dougherty, Adam Albright, and Douglas et al.: The paper shows how smartphones in a wireless mobile sensor network can capture the streams of data provided by their accelerometers, compasses, and GPS sensors to provide a portable black box that detects traffic accidents and records data related to accident events, such as the G-forces (accelerations) experienced by the driver. It also presents an architecture for detecting car accidents based on WreckWatch, a mobile client/server application developed to automatically detect car accidents. Sensors built into a smartphone detect a major acceleration event indicative of an accident and utilize the built-in 3G data connection to transmit that information to a central server. That server then processes the information and notifies the authorities as well as any emergency contacts.
[7] Deepak Punetha, Deepak Kumar, Vartika Mehta et al.: The paper describes an accident as a deviation from the expected behavior of an event that adversely affects property, living beings, or the environment. Safety in vehicle-to-vehicle communication and travel is a primary concern for everyone. The work presented in the article documents the design of an accident detection system that informs the police control room or any other emergency calling system about the accident. An accelerometer sensor is used to detect an abrupt change in g-forces in the vehicle due to an accident. When the range of g-forces indicates accident severity, the microcontroller activates a GSM modem to send a pre-stored SMS to a predefined phone number, and a buzzer is switched on. The product design was tested in various conditions, and the test results
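The g-force thresholding described in [7] can be sketched as below. The 4 g threshold is an assumed placeholder; real systems calibrate the severity threshold experimentally.

```python
import math

G = 9.81           # gravitational acceleration, m/s^2
THRESHOLD_G = 4.0  # accident-severity threshold in g (assumed value)

def total_g_force(ax: float, ay: float, az: float) -> float:
    """Magnitude of a 3-axis accelerometer reading, expressed in g."""
    return math.sqrt(ax ** 2 + ay ** 2 + az ** 2) / G

def is_accident(ax: float, ay: float, az: float) -> bool:
    """Flag readings whose g-force exceeds the severity threshold."""
    return total_g_force(ax, ay, az) >= THRESHOLD_G

# At rest the sensor measures ~1 g (gravity only), well under the threshold;
# a violent impact (e.g. 50 m/s^2 on two axes) exceeds 4 g.
print(is_accident(0.0, 0.0, 9.81))    # False
print(is_accident(50.0, 50.0, 9.81))  # True
```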


confirm the stability and reliability of the system.
[8] Alexandra Fanca, Adela Puscasiu et al.: The paper describes the implementation of a system able to gather a set of information from the user, associate that information with a location using a GPS tracking system, and create an accident report. The system senses the GPS coordinates of the person, displays the coordinates on a map, and computes the shortest route to the accident site. The system also detects the accident automatically when it occurs. The paper focuses on the mobile part of the system.

4. GAP ANALYSIS

Paper   Smartphone   Micro-       GPS        Sensor     Cost
No.     used         controller   accuracy              Effective
10      No           Yes          No         External   No
11      No           Yes          Yes        External   No
12      Yes          No           Yes        Internal   Yes
13      Yes          No           Yes        Internal   No
14      No           Yes          Yes        External   No

Table 4.1: Gap Analysis table

5. EXISTING SYSTEM
This system uses an accident detection algorithm based on an accelerometer sensor on the vehicle side. At the receiver side, the location of the accident is made known by displaying the name of the place where it occurred in a newly developed Android application. By identifying changes in the accelerometer sensor tilt, the possibility of an accident can be detected. The system adopts two different technologies, namely embedded and Android: embedded technology is used to detect the accident with the accelerometer sensor, and Android technology is used to present the location as a place name instead of raw latitude and longitude values, so that even a layman can understand them and know the vehicle's location. The Android app that resolves the location name when the mobile receives GPS data plays a major role in this system. The major limitation of the system is the signal to the GPS receiver: the GPS receiver requires good signal conditions to ensure exact or correct location data. The system is also cost-inefficient.
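The tilt change mentioned above can be estimated from the accelerometer's gravity components. The axis convention here (z pointing out of the device face) is an assumption for illustration.

```python
import math

def tilt_deg(ax: float, ay: float, az: float) -> float:
    """Angle between the device z-axis and the measured gravity vector.

    At rest only gravity is measured, so a sudden, persistent jump in
    this angle (e.g. toward 90 degrees) can indicate a rollover."""
    mag = math.sqrt(ax ** 2 + ay ** 2 + az ** 2)
    return math.degrees(math.acos(az / mag))

print(round(tilt_deg(0.0, 0.0, 9.81)))  # 0  (device flat)
print(round(tilt_deg(9.81, 0.0, 0.0)))  # 90 (device on its side)
```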

6. PROPOSED WORK
Today, almost everyone in the world has a smartphone in hand. In this project we present an Android application: a lightweight, flexible and power-efficient smartphone-based system. Our system provides an alert about accident-prone areas as soon as the vehicle enters such a region. As soon as certain events such as rollover or sharp deceleration are detected by the Android sensors, accident confirmation must be provided, and the response needs to be quick. On confirmation of an accident, the concerned authorities must be contacted immediately. If a certain area has a high number of accidents and is not registered within the app, the details of such an area can be reported by the users. A flowchart is given below for further understanding of this application.
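The accident-prone-area alert above reduces to a point-in-radius test against a registry of zones. A minimal sketch, assuming a hypothetical zone list with coordinates and radii (the actual app would fetch these from its server):

```python
import math

# Hypothetical registry of accident-prone zones: (lat, lon, radius in metres).
PRONE_ZONES = [
    (18.5204, 73.8567, 500.0),  # assumed example coordinates
]

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS-84 points."""
    r = 6371000.0  # mean Earth radius, m
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def should_alert(lat, lon):
    """True when the current GPS fix falls inside any registered zone."""
    return any(haversine_m(lat, lon, zlat, zlon) <= zr
               for zlat, zlon, zr in PRONE_ZONES)

print(should_alert(18.5204, 73.8567))  # True  (at the zone centre)
print(should_alert(19.0760, 72.8777))  # False (far outside the zone)
```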


Figure 6.1. Accident Scenario

7. CONCLUSION
Accident detection systems help decrease deaths stemming from car accidents by reducing the reaction time of emergency responders. The greatest advantage of this project is that it needs no cellular networks and it fully utilizes the capabilities of the smartphone. This project provides two offerings to the study of smartphone-based accident detection systems. First, we explain solutions to key issues connected with detecting traffic accidents, such as preventing false positives by utilizing portable environment information and polling onboard sensors to detect large accelerations. Second, we present the architecture of our prototype smartphone-based accident detection system and empirically analyze its capability to resist false positives as well as its capabilities for accident reconstruction.

REFERENCES


[1] Abdul Khaliq, Amir Qayyum, Jurgen Pannek, "Prototype of Automatic Accident and Management in Vehicular Environment using VANET and IOT," Nov 2017.
[2] Bruno Fernandes, Vitor Gomes, Arnaldo Oliveira, "Mobile application for automatic accident detection multimodal alert," Oct 2015.
[3] Jie Yang, Jie Wang, Benyuan Liu, "An intersection collision warning system using WiFi smartphones in VANET," 2012.
[4] Sneha R. Sontakke, Dr. A. D. Gawande, "Crash notification system for portable devices," Nov 201
[5] G. Jaya Suma, R. V. S. Lalitha, "Revitalizing VANET communication using Bluetooth devices," 2016.
[6] M. B. I. Reaz, Md. Syedul Amin, Jubayer Jalil, "Accident detection and reporting using GPS, GPRS and GSM technology," 2012.
[7] Evellyn S. Cavalcante, Andre L. L. Aquino, Antonio A. F. Loureiro, "Roadside unit deployment for information dissemination in a VANET," 2018.
[8] Hamid M. Ali, Zainab S. Alwan, "Car accident detection and notification system using smartphone," 2015.
[9] Oliver Walter, Joerg Schmalenstroeer, Andreas Engler, "Smartphone based sensor fusion for improved vehicular navigation," 2013.
[10] Parag Parmar, Ashok M. Sapkal, "Real time detection and reporting of vehicle collision," 2017.
[11] Dr. Sasi Kumar, Soumyalatha, Shruti G. Hegde, "IoT approach to save life using GPS for the traveller during accident," 2017.
[12] Jayanta Pal, Bipul Islam, "Method for smartphone based accident detection," 2018.
[13] Henry Messenger, Leonid Baryudin, "Fall detection system using a combination of accelerometer, audio input and magnetometer," 2017.
[14] Bannaravuri Amrutha Valli, "Vehicle positioning system with accident detection using accelerometer sensor and android technology," 2017.


GENERATION OF MULTI-COLOR QR CODE USING VISUAL SECRET SHARING SCHEME Nirup Kumar Satpathy1, Sandhya Barikrao Ingole2, Pari Sabharwal3, Harmanjeet Kour4 1,2,3,4 Dept. of Computer Engineering,Smt. Kashibai Navale College of Engineering,Savitribai Phule Pune University,Pune,India. [email protected], [email protected], [email protected], [email protected]

ABSTRACT
The QR code was designed for data storage and fast reading applications. Quick Response (QR) codes are extensively used in fast reading applications such as data storage and high-speed machine reading. Anyone can gain access to the data saved in QR codes; hence, they are unsuitable for encoding secret information without the addition of cryptography or other protection. This paper proposes a visual secret sharing scheme to encode a secret QR code into distinct shares. In comparison with other techniques, the shares in the proposed scheme are valid QR codes that can be decoded with some specific meaning by a standard QR code reader, so the shares are less likely to raise the suspicion of attackers. In addition, the secret message is recovered by XOR-ing the qualified shares, an operation that can easily be performed using smartphones or other QR scanning devices. The contribution of this work is to maximize the storage size of the QR code and to generate a multi-colored QR code. Experimental results show that the proposed scheme is feasible and its cost is low. Two division approaches are provided, which effectively improve the sharing efficiency of the (k, n) method. The scheme's high sharing performance is also highlighted in this paper.

KEYWORDS
Division algorithm, error correction capacity, high security, (k, n) access structure, Quick Response code, visual secret sharing scheme

1. INTRODUCTION
In recent years, the QR code has become widely used. In daily life, QR codes appear in a variety of scenarios that include information storage, web links, traceability, identification and authentication. First, the QR code is easily identified by computer equipment, for example mobile phones and scanning guns. Second, the QR code has a large storage capacity, strong damage resistance, and low cost.

Specific QR code structure: As represented in Fig. 1, the QR code has a unique structure for geometrical correction and high-speed decoding. Three position tags are used for QR code detection and orientation correction. One or more alignment patterns are used to correct code deformation.

Fig. 1: Specific QR Code Structure

The module coordinates are set by timing patterns. Furthermore, the format information areas contain the error correction level and mask pattern. The code version and error correction bits are stored in the version information areas. The popularity of QR codes is primarily due to the following features:
 The QR code is robust to the copying process,




 It is easy to read by any device and any user,
 It has a high encoding capacity enhanced by error correction facilities,
 It is small in size and robust to geometrical distortion.

Visual cryptography is a new secret sharing technology. It encodes a secret into share images and restores the secret by relying on human visual decryption. Compared with traditional cryptography, it has the advantages of concealment, security, and simplicity of secret recovery. The method of visual cryptography meets the high security requirements of users and protects them against various security attacks, and it is easy to generate value in business applications. In this paper, we propose a standard multi-color QR code using textured patterns, hiding data by text steganography and securing the data with a visual secret sharing scheme.

2. MOTIVATION
The motivation of this work is that the storage capacity can be significantly improved by increasing the code alphabet q or by increasing the textured pattern size. This increases the storage capacity of the classical QR code and provides security for private messages using a visual secret sharing scheme.

3. STATE OF ART
The paper [1] proves that the contrast of XVCS is greater than that of OVCS. The monotone property of the OR operation degrades the visual quality of the reconstructed image for OR-based VCS (OVCS). Accordingly, XOR-based VCS (XVCS), which uses the XOR operation for decoding, was proposed to enhance the contrast. Advantages: the secret image is easily decoded by a stacking operation, and XVCS yields a better reconstructed image than OVCS. Disadvantage: the proposed algorithm is more complicated.
The paper [2] presents a blind, key-based watermarking technique, which embeds a transformed binary form of the watermark data into the DWT domain of the cover image and uses a unique image code for the detection of image distortion. The QR code is embedded into the attack-resistant HH component of the 1st-level DWT domain of the cover image to detect malicious interference by an attacker. Advantages: more information representation per bit change combined with error correction capabilities; increased usability of the watermark data and robustness against visually invariant data removal attacks. Disadvantages: limited to an LSB bit in the spatial domain of the image intensity values; since the spatial domain is more susceptible to attacks, this cannot be used.
The paper [3] designs a secret QR sharing approach to protect private QR data with a secure and reliable distributed system. The proposed approach differs from related QR code schemes in that it uses QR characteristics to achieve secret sharing and can resist the print-and-scan operation. Advantages: it reduces the security risk of the secret; the approach is feasible; it provides content readability, cheater detectability, and an adjustable secret payload of the QR barcode. Disadvantages: the security of the QR barcode needs improvement, and the QR technique requires reducing the modifications.
The two-level QR code (2LQR) has two storage levels, public and private, and can be used for document authentication [4]. The public level is the same as the standard QR code storage level; therefore it is readable by any classical QR code application. The private level is constructed by replacing the black modules with specific textured patterns. It consists of information encoded using a q-ary code with an error correction capacity. Advantages: it increases the storage capacity of the classical QR code; the textured patterns used in 2LQR are sensitive to the P&S process. Disadvantages: the pattern recognition method needs improvement, and there is a need to


increase the storage capacity of 2LQR by replacing the white modules with textured patterns.
To protect sensitive data, the paper [5] explores the characteristics of QR barcodes to design a secret hiding mechanism for the QR barcode with a higher payload compared to past schemes. For a normal scanner, a browser can only reveal the formal information from the marked QR code. Advantages: the designed scheme is feasible for hiding secrets in a tiny QR tag for the purpose of steganography, and only the authorized user with the private key can reveal the concealed secret. Disadvantage: the security needs to be increased.

4. GAP ANALYSIS

TABLE: GAP ANALYSIS

Sr. No. 1
Author, Title and Journal Name: C. N. Yang, D. S. Wang, "Property Analysis of XOR-Based Visual Cryptography," IEEE Transactions on Circuits & Systems for Video Technology, vol. 24, no. 12, pp. 189-197, 2014.
Technique Used: XOR-based VCS (XVCS)
Advantages: 1. Easily decodes the secret image by a stacking operation. 2. XVCS has a better reconstructed image than OVCS.

Sr. No. 2
Author, Title and Journal Name: P. P. Thulasidharan, M. S. Nair, "QR code based blind digital image watermarking with attack detection code," AEU - International Journal of Electronics and Communications, vol. 69, no. 7, pp. 1074-1084, 2015.
Technique Used: Watermarking technique for QR code
Advantages: 1. More information representation per bit change combined with error correction capabilities. 2. Increases the usability of the watermark data and maintains robustness against visually invariant data removal attacks.

Sr. No. 3
Author, Title and Journal Name: P. Y. Lin, "Distributed Secret Sharing Approach with Cheater Prevention Based on QR Code," IEEE Transactions on Industrial Informatics, vol. 12, no. 1, pp. 384-392, 2016.
Technique Used: A secret QR sharing scheme
Advantages: 1. Reduces the security risk of the secret. 2. The approach is feasible. 3. It provides content readability, cheater detectability, and an adjustable secret payload of the QR barcode.

Sr. No. 4
Author, Title and Journal Name: I. Tkachenko, W. Puech, C. Destruel, et al., "Two-Level QR Code for Private Message Sharing and Document Authentication," IEEE Transactions on Information Forensics & Security, vol. 11, no. 13, pp. 571-583, 2016.
Technique Used: Two-level QR code
Advantages: 1. It increases the storage capacity of the classical QR code. 2. The textured patterns used in 2LQR are sensitive to the P&S process.

Sr. No. 5
Author, Title and Journal Name: P. Y. Lin, Y. H. Chen, "High payload secret hiding technology for QR codes," Eurasip Journal on Image & Video Processing, vol. 2017, no. 1, pp. 14, 2017.
Technique Used: Secret hiding for QR barcodes
Advantages: 1. The designed scheme is feasible for hiding secrets in a tiny QR tag for the purpose of steganography. 2. Only the authorized user with the private key can reveal the concealed secret.

5. PROPOSED WORK
In this paper, an innovative scheme is proposed to improve the security of QR codes using XVCS theory. First, an improved (n, n) sharing method is designed to avoid the security weakness of existing methods. On this basis, we consider a method for (k, n) access structures by utilizing a (k, k) sharing instance on every k-participant subset. This approach would require a large number of instances as n increases; therefore, we present two division algorithms to classify all the k-participant subsets into several collections, in which instances of multiple subsets can be replaced by only one.
 Enhanced (n, n) sharing method
 (k, n) sharing method
Based on the enhanced (n, n) method, a (k, n) method can be achieved if we apply the (k, k) instance to every k-participant subset of the (k, n) access structure. However, this would produce a huge number of (k, k) instances.
Advantages:
 Secure encoding of document or text.
 Text steganography for message encoding.
 Increased sharing efficiency.
 VCS has low computational complexity.
 Higher security and more flexible access structures.
 Lower computation cost.
 Stego synthetic texture for QR code hiding.

Fig 2: Proposed System Architecture
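The XOR-based recovery underlying the scheme can be illustrated with a plain (n, n) byte-level sharing sketch. This is illustrative only: the actual scheme operates on QR code modules and keeps each share a scannable QR code, which plain random bytes do not.

```python
import secrets

def make_shares(secret: bytes, n: int) -> list:
    """Split `secret` into n shares; all n are required for recovery."""
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    # Last share = secret XOR all random shares, so XOR-ing everything
    # cancels the randomness and leaves the secret.
    last = bytes(secret)
    for s in shares:
        last = bytes(a ^ b for a, b in zip(last, s))
    shares.append(last)
    return shares

def recover(shares: list) -> bytes:
    """XOR all qualified shares together to reconstruct the secret."""
    out = bytes(len(shares[0]))
    for s in shares:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out

msg = b"secret QR payload"
parts = make_shares(msg, 3)
print(recover(parts) == msg)  # True
```

Any n-1 shares are uniformly random and reveal nothing about the secret, which mirrors the security property stated in the mathematical model below in spirit.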

6. MATHEMATICAL MODEL
Two collections of Boolean matrices constitute an (n, n)-XVCS if the following two conditions are satisfied. The first property is contrast, which illustrates that the secret can be recovered by XOR-ing all participant shares. The second property is security, which prevents any k (k < n) participants from gaining any knowledge of the secret.

Enhanced (n, n) sharing method


Define two blocks to belong to an identical group G if condition (3) is satisfied. With the above definition, we can divide the blocks into several groups. For example, to determine whether two blocks are of the same group, we evaluate the relation in (3); if it holds, we conclude that they are of an identical group, and vice versa. A block different from all other blocks will not be contained in any group. A block is said to be responsible for a share if it is reversed in that share; otherwise it represents the opposite case. A matrix X is constructed by solving (1).

(4)

If n satisfies the required condition, there must be a solution to (1). In addition, we can adjust the parameter value to balance errors between the covers and the reconstructed secret. Based on X, we design a new sharing algorithm.

7. CONCLUSION
In this paper, we proposed a visual secret sharing scheme for QR code applications, which makes improvements mainly on two aspects: higher security and more flexible access structures. In addition, we extended the access structure from (n, n) to (k, n) by further investigating the error correction mechanism of QR codes. Two division approaches are provided, effectively improving the sharing efficiency of the (k, n) method. Therefore, the computational cost of our work is much smaller than that of previous studies which can also achieve a (k, n) sharing method. Future work will aim to make the QR code reader scan a QR code within a fraction of a second.

REFERENCES
[1] C. N. Yang, D. S. Wang, "Property Analysis of XOR-Based Visual Cryptography," IEEE Transactions on Circuits & Systems for Video Technology, vol. 24, no. 12, pp. 189-197, 2014.
[2] P. P. Thulasidharan, M. S. Nair, "QR code based blind digital image watermarking with attack detection code," AEU - International Journal of Electronics and Communications, vol. 69, no. 7, pp. 1074-1084, 2015.
[3] P. Y. Lin, "Distributed Secret Sharing Approach with Cheater Prevention Based on QR Code," IEEE Transactions on Industrial Informatics, vol. 12, no. 1, pp. 384-392, 2016.
[4] I. Tkachenko, W. Puech, C. Destruel, et al., "Two-Level QR Code for Private Message Sharing and Document Authentication," IEEE Transactions on Information Forensics & Security, vol. 11, no. 13, pp. 571-583, 2016.
[5] P. Y. Lin, Y. H. Chen, "High payload secret hiding technology for QR codes," Eurasip Journal on Image & Video Processing, vol. 2017, no. 1, pp. 14, 2017.
[6] https://en.wikipedia.org/wiki/QR_code
[7] F. Liu, T. Guo, "Privacy protection display implementation method based on visual passwords," CN Patent App. CN 201410542752, 2015.
[8] S. J. Shyu, M. C. Chen, "Minimizing Pixel Expansion in Visual Cryptographic Scheme for General Access Structures," IEEE Transactions on Circuits & Systems for Video Technology, vol. 25, no. 9, pp. 1-1, 2015.
[9] H. D. Yuan, "Secret sharing with multi-cover adaptive steganography," Information Sciences, vol. 254, pp. 197-212, 2014.
[10] J. Weir, W. Q. Yan, "Authenticating Visual Cryptography Shares Using 2D Barcodes," in Digital Forensics and Watermarking. Berlin, Germany: Springer Berlin Heidelberg, 2011, pp. 196-210.


VERIFYING THE INTEGRITY OF DIGITAL FILES USING DECENTRALIZED TIMESTAMPING ON THE BLOCKCHAIN Akash Dhande1, Anuj Jain2, Tejas Jain3, Tushar Mhaslekar4, Prof. P. N. Railkar5, Jigyasa Chadha6 1,2,3,4,5

Dept of Computer Engineering, Smt. Kashibai Navale College of Engineering, Pune, India. 6 Department of ECE, HMR Institute of Technology and Management, Delhi [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

ABSTRACT
In today's day and age, the integrity and authenticity of digital files and documents is a critical issue, especially if those digital files are to be submitted as evidence in court, for example a video file of an accident. Forgers can exploit such files by editing the video and leading the court into a wrong judgement. Therefore, this paper proposes a system to prove the integrity of a digital file, such as the video proof of an accident in the above example. The complete system consists of three functions: one for calculating and storing the hash value of the digital file and its details, a second for proving the integrity of a given file by comparing it with the stored hash and its timestamp, and a third for storing and retrieving the original file on the InterPlanetary File System (IPFS) network. In this approach, one can store the integrity of a file and use it to prove the authenticity of that file by comparing the hash of another file with the stored hash. This paper proposes a system that uses the new and emerging technology of Blockchain to secure the integrity of digital files.

Keywords
Decentralized, Blockchain, Timestamping, IPFS

1. INTRODUCTION
The proposed system uses new and emerging technologies like Blockchain to store the hash of the files uploaded by the user, which represents the integrity of each file, and the IPFS (InterPlanetary File System) network, which is used to store the original files along with the trusted timestamp and the location from which each file was uploaded.

Blockchain is a decentralized ledger or data structure. It can be described as blocks in a chain where each block refers to the block prior to it [5]. A blockchain is a decentralized, distributed and public digital ledger that is used to record transactions across many computers so that the record cannot be altered retroactively without the alteration of all subsequent blocks and the consensus of the network. This allows the participants to verify and audit transactions inexpensively. A blockchain database is managed autonomously using a peer-to-peer network and distributed timestamping [7]. Records are authenticated by mass collaboration powered by collective self-interest. The result is a robust workflow where participants' uncertainty regarding data security is marginal. So essentially a blockchain is a distributed ledger which cannot be tampered with, ensuring the security of the data stored in it.

IPFS (InterPlanetary File System) is a peer-to-peer distributed file system that seeks to connect all computing devices with the same system of files [1]. In some ways, IPFS is similar to the World Wide Web, but IPFS could be seen as a single BitTorrent swarm exchanging objects within one Git repository. In other words,
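The first two functions of the system — storing a file's hash with a timestamp and later proving integrity by re-hashing — can be sketched with Python's standard hashlib. The record layout is illustrative, not the paper's actual on-chain transaction format.

```python
import hashlib
import os
import tempfile
import time

def file_record(path: str) -> dict:
    """Hash a file in chunks and bundle the digest with a timestamp."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return {"sha256": h.hexdigest(), "timestamp": int(time.time())}

def verify(path: str, record: dict) -> bool:
    """A file proves its integrity iff its current hash matches the record."""
    return file_record(path)["sha256"] == record["sha256"]

# Demo: record a file, verify it, then tamper with it and re-check.
fd, demo = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"dashcam footage bytes")
rec = file_record(demo)
ok_before = verify(demo, rec)   # unchanged file matches its record
with open(demo, "ab") as f:
    f.write(b"!")               # simulate an edit by a forger
ok_after = verify(demo, rec)    # tampered file no longer matches
os.remove(demo)
print(ok_before, ok_after)  # True False
```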


IPFS provides a high-throughput, content-addressed block storage model, with content-addressed hyperlinks. This forms a generalized Merkle directed acyclic graph (DAG). IPFS combines a distributed hash table, an incentivized block exchange, and a self-certifying namespace. IPFS has no single point of failure, and nodes do not need to

trust each other not to tamper with data in transit. Distributed content delivery saves bandwidth and prevents DDoS attacks, which HTTP struggles with. Section 2 contains the State of Art, Section 3 the Gap Analysis, Section 4 the User Classes and Characteristics, Section 5 the Proposed Work, Section 6 the Conclusion and Future Work, and Section 7 the References.
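Content addressing — the core of the IPFS model described above — can be illustrated with a toy in-memory store. Real IPFS uses multihash-based CIDs and a distributed hash table rather than a local dictionary; this sketch only shows why retrieval by content hash verifies integrity for free.

```python
import hashlib

# Toy content-addressed store: the key IS the hash of the value.
store = {}

def put(data: bytes) -> str:
    """Store content under the hex digest of the content itself."""
    cid = hashlib.sha256(data).hexdigest()
    store[cid] = data
    return cid

def get(cid: str) -> bytes:
    """Re-hash on retrieval: any tampering changes the digest."""
    data = store[cid]
    if hashlib.sha256(data).hexdigest() != cid:
        raise ValueError("content does not match its address")
    return data

cid = put(b"hello ipfs")
print(get(cid) == b"hello ipfs")  # True
```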

2. STATE OF ART
The papers [2][3][4] discuss trusted timestamping as used in cryptocurrencies like Bitcoin. This system was proposed by Norman Meuschke and Andre Gernandt in 2015. The system uses the hash of digital data and can be used to record transactions into blocks. The papers [5][7], proposed by Rishav and Rajdeep Chatterjee in 2017, consist of a detailed implementation of blockchain technology, and their use-cases include transactions between multiple parties based on Hyperledger.

The paper [1] describes a copyright management system based on digital watermarking that includes a blockchain, a perceptual hash function, quick response (QR) codes, and the InterPlanetary File System (IPFS), with related work on comparing copyrights of digital files. IPFS is used to store and distribute watermarked images without a centralized server. This scheme can enhance the effectiveness of digital watermarking technology in the field of copyright protection. The concept of digital watermarking was introduced by Meng Zhaoxiong and Morizumi Tetsuya in 2018.

The system in paper [8] is a peer-to-peer distributed file system that seeks to connect all computing devices with the same system of files. IPFS combines a distributed hash table, an incentivized block exchange, and a self-certifying namespace. IPFS has no single point of failure, and nodes do not need to trust each other. This concept was introduced by Juan Benet.

The paper [9] introduces blockchain, a form of database storage that is non-centralized, reliable, and difficult to use for fraudulent purposes. Transactions are made with