Merge ~wisniak199/dpcs:classification_A_18 into dpcs:master
Proposed by Piotr Wiśniewski
Status: | Merged |
---|---|
Approved by: | Marek Bardoński |
Approved revision: | 6be75f7d68f4f6a81074fd53ce5a957e01680ef0 |
Merge reported by: | Marek Bardoński |
Merged at revision: | not available |
Proposed branch: | ~wisniak199/dpcs:classification_A_18 |
Merge into: | dpcs:master |
Diff against target: | 4 lines (+0/-0) 0 files modified |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Marek Bardoński | Approve | ||
Review via email: mp+288225@code.launchpad.net |
Hi,
Very good job. I have some comments for you:
* We probably can't use the NN directly because of the scale of the problem. It would be super slow. I think we could substitute the NN with DL (deep learning), a kind of deep neural network that works better on large amounts of data.
* A dictionary consisting of the most common English words is called a stopwords dataset. It's already implemented in scientific libraries.
* Common words in programming, like 'python', are called domain words. They are used in Inverse Domain Frequency techniques to extract the most important words.
A few ideas from me: maybe this concept should be applied at package scope? Or maybe we should use a two-step algorithm that first selects the package?
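To make the stopwords and domain-word points concrete, here is a minimal sketch rather than the code from this branch. It assumes plain-text inputs, uses a tiny hypothetical stopword set in place of a library-provided one, and swaps in standard smoothed inverse document frequency (IDF) weighting as a stand-in for the "Inverse Domain Frequency" idea: words that occur in every text, such as 'python', get the lowest weight. The sample texts and the function names `tokenize`, `idf_weights`, and `top_terms` are made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword set; a real list would come from a library.
STOPWORDS = {"a", "an", "the", "is", "in", "of", "to", "and", "no", "not"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def idf_weights(documents):
    """Return a smoothed inverse document frequency for every token in the corpus."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # Count each token once per document, however often it repeats inside it.
        doc_freq.update(set(tokenize(doc)))
    return {
        tok: math.log((1 + n_docs) / (1 + df)) + 1.0
        for tok, df in doc_freq.items()
    }


def top_terms(text, idf, k=3):
    """Rank the tokens of one text by term frequency times IDF, ties broken alphabetically."""
    counts = Counter(tokenize(text))
    scored = {tok: tf * idf.get(tok, 0.0) for tok, tf in counts.items()}
    return sorted(scored, key=lambda t: (-scored[t], t))[:k]


# Made-up sample texts, only for illustration.
samples = [
    "python traceback: import error in module numpy",
    "python traceback: permission denied in the file",
    "python segfault in module gtk",
]
idf = idf_weights(samples)
print(top_terms(samples[0], idf))
```

In this sketch 'python' appears in every sample, so it receives the smallest weight and drops out of the top terms, while words unique to one sample rank highest. A package-scoped variant, as suggested above, could compute the weights separately per package.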