e-Libs >> TRUNOJOYO Library


TRUNOJOYO » Final Projects & Theses » Informatics
Posted by 170411100107 on 2022-08-23 09:08:50

TOPIC MODELLING MENGGUNAKAN DATA TWITTER DENGAN METODE LATENT SEMANTIC ANALYSIS
TOPIC MODELLING USING TWITTER DATA WITH LATENT SEMANTIC ANALYSIS METHOD

Written by MOH. ROMADHANI FIRDAUS

Subject:	Application of text mining during the Covid-19 pandemic using Twitter data
Keywords:	Text Mining, Topic Modeling, LSA

[ Abstract ]

The Covid-19 pandemic struck the whole world, Indonesia included, and its high death rate prompted many responses from the public, voiced on social media throughout the pandemic. Twitter is one of the platforms used to share these responses, in the form of tweets. With so many responses, however, it takes considerable time to work out what important information they contain. This study therefore develops a system that identifies topics in tweets, where the topics represent public responses on Twitter.

In text mining, identifying topics is called Topic Modeling. In this thesis, the Topic Modeling stages are: Crawling Data; Text Preprocessing, whose cleansing part consists of Case Folding, Tokenizing, Stopword Removal, and Stemming; and term weighting with Term Frequency-Inverse Document Frequency (TF-IDF). Feature selection, also part of Text Preprocessing, uses Pearson Correlation with a threshold above 0.6 to reduce the number of terms by dropping those whose correlation value is below 0.6. Topic Modeling is then performed with the Latent Semantic Analysis (LSA) algorithm, which outputs keywords that experts subsequently label. Topics are evaluated with Topic Coherence: the closer the score is to 1, the better the quality of the resulting topics.

Test scenarios were run for threshold values from 0.6 to 0.9 on data crawled with the queries “vaksinasi” (vaccination) and “kebijakan vaksinasi” (vaccination policy). Only one topic was found, labeled “Politik” (Politics), with Coherence Scores that differ by threshold: 0.0004955403009978006 at threshold 0.6, and 0 at thresholds 0.7, 0.8, and 0.9.
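
The stages named in the abstract (cleansing, TF-IDF weighting, Pearson-based feature selection, LSA) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the thesis code: the toy corpus, the stopword list, and the helper names `preprocess`, `tfidf`, `pearson`, `select_terms`, and `top_topic` are all hypothetical. The abstract does not say which variables the Pearson correlation is computed between, so the sketch assumes term-to-term correlation of TF-IDF columns and keeps a term when its correlation with at least one other term exceeds the threshold. Stemming and Topic Coherence are omitted, and a plain power iteration stands in for a library SVD, which yields only the single strongest LSA topic.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus and stopword list, for illustration only.
DOCS = [
    "Vaksinasi covid gratis untuk masyarakat",
    "Kebijakan vaksinasi pemerintah dan politik",
    "Politik kebijakan pemerintah soal vaksinasi",
    "Masyarakat antre vaksinasi covid gratis",
]
STOPWORDS = {"untuk", "dan", "soal"}


def preprocess(text):
    """Case folding, tokenizing, and stopword removal (stemming omitted)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def tfidf(docs_tokens):
    """Return (vocabulary, matrix) with one TF-IDF row per document."""
    vocab = sorted({t for doc in docs_tokens for t in doc})
    n = len(docs_tokens)
    df = Counter(t for doc in docs_tokens for t in set(doc))
    rows = []
    for doc in docs_tokens:
        tf = Counter(doc)
        rows.append([tf[t] / len(doc) * math.log(n / df[t]) for t in vocab])
    return vocab, rows


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def select_terms(vocab, rows, threshold=0.6):
    """Assumed rule: keep a term if it correlates above the threshold with another term."""
    cols = [[row[j] for row in rows] for j in range(len(vocab))]
    keep = [
        j for j in range(len(vocab))
        if any(pearson(cols[j], cols[k]) > threshold
               for k in range(len(vocab)) if k != j)
    ]
    return [vocab[j] for j in keep], [[row[j] for j in keep] for row in rows]


def top_topic(rows, iterations=100):
    """Leading right singular vector of the matrix, by power iteration on A^T A."""
    m = len(rows[0])
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        av = [sum(r[j] * v[j] for j in range(m)) for r in rows]  # A v
        w = [sum(rows[i][j] * av[i] for i in range(len(rows)))   # A^T (A v)
             for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


tokens = [preprocess(d) for d in DOCS]
vocab, matrix = tfidf(tokens)
kept_vocab, kept_matrix = select_terms(vocab, matrix, threshold=0.6)
weights = top_topic(kept_matrix)
# The three highest-weighted terms act as the topic's keywords.
keywords = [t for _, t in sorted(zip(weights, kept_vocab), reverse=True)[:3]]
print("terms kept:", kept_vocab)
print("topic keywords:", keywords)
```

In this toy setup a term that occurs in every document gets a TF-IDF weight of zero everywhere, so its column is constant and the feature-selection step drops it. Raising `threshold` toward 0.9 shrinks the vocabulary further, which is the knob the test scenarios in the abstract vary.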

Contributors	: Mula’ab, S.Si., M.Kom; Dr. Wahyudi Setiawan, S.Kom, M.Kom
Date created	: 2022-02-22
Type	: Text
Format	: pdf
Language	: Indonesian
Identifier	: TRUNOJOYO-Tugas Akhir-25531
Collection No.	: 170411100107

Source:
https://github.com/danitkj2bangkalan/TopicModeling

Relation/Link:
https://github.com/mulaab

Coverage:
Text Mining

Rights:
Open source

Download attached files (registered members only)

1. TRUNOJOYO-Tugas Akhir-25531-ABSTRAK.pdf - 80 KB
2. TRUNOJOYO-Tugas Akhir-25531-COVER.pdf - 436 KB
3. TRUNOJOYO-Tugas Akhir-25531-BAB I.pdf - 102 KB
4. TRUNOJOYO-Tugas Akhir-25531-BAB II.pdf - 416 KB
5. TRUNOJOYO-Tugas Akhir-25531-BAB III.pdf - 344 KB
6. TRUNOJOYO-Tugas Akhir-25531-BAB IV.pdf - 744 KB
7. TRUNOJOYO-Tugas Akhir-25531-BAB V.pdf - 81 KB
8. TRUNOJOYO-Tugas Akhir-25531-Daftar Pustaka.pdf - 98 KB
9. TRUNOJOYO-Tugas Akhir-25531-Lampiran.pdf - 449 KB
