A two-day Workshop featuring Tika, Solr, Cassandra, Hadoop and Mahout
Date: 16-17 June 2012 | Time: 9:30 - 17:30 | Helpline: +91-9449804064
Venue: Madhu Infotech, #106, HVM House, Amarjyothi Layout, Domlur, Bangalore (near Aadhishwar Showroom)
Registration Fee (Regular): Rs. 3000 (for Corporate) / Rs. 2500 (for Individual) | Early Bird (till 8 June 2012): Rs. 2000 (for Corporate) / Rs. 1500 (for Individual)
Objective
- Content Extraction (hands-on using Apache Tika)
- Distribute Content the NoSQL way (hands-on using Cassandra, Neo4j)
- Search and Indexing (hands-on using Solr and Tika)
- Distributed computing and analysis using Hadoop MapReduce and Mahout (hands-on using Hadoop MapReduce, Mahout)
Register Now! Only 30 Nodes in our Cluster
Delivered by Members of Technology Peer Group
(a group of technology experts from across organizations)
With relevant and intensive hands-on sessions
Motivation
- Advances in digital sensors, communications, computation, and storage have created huge collections of data, capturing information of value to business, science, government, and society.
- Businesses and industry are increasingly interested in leveraging the data they capture through business processes.
Who should attend
- Professionals who are keen to understand and get familiar with topics in the Big Data space: handling and processing huge amounts of data in diverse formats and structures.
- Professionals who are interested in getting hands-on experience with some of the leading open source packages in this area.
Registration Procedure
- Confirm your registration by e-mail, including the participant's profile and payment details.
Payment Options
Account Name: Ram Software Engineering Labs Pvt Ltd | Bank Name: ICICI Bank | IFSC Code: ICIC0000047
Bank Account Number: 004705009040 | Account Type : Current Account | Beneficiary Bank Address: Koramangala Branch, Bangalore
Agenda - Day 1
Content Extraction using Apache Tika
- Characteristics and Challenges of Content Extraction systems
- Overview of Tika
- Working with Tika core and the Tika Parser library to extract metadata and content from "rich documents"
Storing metadata and content in NoSQL
- Overview of NoSQL
- Compare and contrast various NoSQL packages
- Overview of Cassandra and Neo4j
- Demonstration and hands-on in Cassandra
Searching using Apache Solr
- Overview of Search Engines
- Working with Solr—the search Engine
- Integrating Tika, Solr and Cassandra to build a scalable extraction and retrieval system for unstructured data
Agenda - Day 2
Hadoop MapReduce
- Hadoop MapReduce
- Installing and configuring Hadoop (Single node and Cluster Setup)
- Our first Map-Reduce job on Hadoop
- Payload, Task Execution & Environment, Job Management
- Our fully distributed Map Reduce job on Hadoop
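For a flavour of what a first MapReduce job does, here is a minimal word-count sketch in plain Python. It only imitates the map, shuffle and reduce phases in a single process and does not use Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, not workshop material.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data node"]))
```

On a real cluster the map and reduce functions run on different nodes and the framework handles the grouping step; the sketch keeps everything in memory so the data flow is easy to follow.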
Machine Learning using Mahout
- Overview of Machine Learning
- Setting up and configuring Mahout
- Mahout Background: Matrix and Vector Needs, Collection (De-)Serialization
- Mahout collections
- Overview of Machine Learning Algorithms
- Mahout recommendation engine running on Hadoop cluster
- Comparison of Mahout with other Machine Learning packages
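To illustrate the idea behind a recommendation engine, the following is a small user-based collaborative filtering sketch in plain Python. It is not Mahout's API and says nothing about how Mahout distributes the work on Hadoop; the function names (`cosine`, `recommend`) and the sample ratings are hypothetical and exist only for this example.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two sparse rating dicts (missing ratings count as 0)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[item] * b[item] for item in common)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=2):
    """Rank items the user has not rated by similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


if __name__ == "__main__":
    # Hypothetical ratings: user -> {item: rating}
    sample = {
        "alice": {"tika": 5, "solr": 4},
        "bob": {"tika": 5, "solr": 5, "hadoop": 4},
        "carol": {"mahout": 5, "solr": 1},
    }
    print(recommend(sample, "alice"))
```

Items rated highly by users whose tastes resemble the target user's rise to the top of the list. A production engine would also normalise the scores and handle far larger rating matrices.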
About the Trainers
Manish Bafna
- Engineering Graduate from Coimbatore
- 8+ years of experience in Java, J2EE and open source technologies
- Worked on many open source projects including Camel, JCIFS, Cassandra, Tika, Solr, Hadoop, Mahout etc.
- Currently working as Associate Director at Exterro, Coimbatore
- Works in the eDiscovery domain on projects related to the collection and processing of documents
- Handles data in the terabyte to petabyte range
- Has experience in collecting and processing documents from workstations (desktops), network shares, SharePoint, Exchange, Lotus Notes and mail systems (Gmail)
Ajit Koti
- Graduate from VTU, Belgaum
- 8+ years of experience in Java, J2EE and open source technologies
- Worked on many open source projects including Solr, Hadoop, Mahout, Neo4j, Gigaspace etc.
- Currently working as Senior Software Engineer at IBM Labs, Bangalore
- Has experience in Machine Learning Algorithms, Distributed Computing and Cloud Computing
Arunkumar Krishnamoorthy
- Computer Science Engineering Graduate from BITS, Pilani
- 12+ years of experience in Software Engineering, Product Development and Management
- Worked on building a number of applications end-to-end on diverse technologies
- Started Ram Software Engineering Labs in 2010 focusing on R&D in various areas of Software development
- Conducted several corporate workshops on topics including the conventional Java stack (core Java - JEE - Spring - Hibernate), OOAD+UML, EAI (Fuse ESB, Camel, OSGi), Maven, XML series, Web Services and professional software development


