ISSUES.APACHE.ORG

We are analyzing https://issues.apache.org/jira/browse/hive-417.

Title:
[HIVE-417] Implement Indexing in Hive - ASF JIRA
Description:
No description found...
Website Age:
30 years and 2 months (reg. 1995-04-11).

Matching Content Categories {📚}

Cryptocurrency
Books & Literature
Personal Finance

Content Management System {📝}

What CMS is issues.apache.org built with?

Custom-built

No common CMS was detected on issues.apache.org, and no known web development framework was identified.

Traffic Estimate {📈}

What is the average monthly size of issues.apache.org audience?

🚀 Good Traffic: 50k - 100k visitors per month

Based on our best estimate, this website will receive around 50,019 visitors in the current month.
However, some sources were not loaded, so this estimate may be incomplete.

How Does Issues.apache.org Make Money? {💸}

We can't figure out the monetization strategy.

Not every website is profit-driven; some are created to spread information or serve as an online presence. Websites can be made for many reasons, and this could be one of them. If Issues.apache.org does have a way of making money, we can't detect it yet.

Keywords {🔍}

index, table, added, comment, jul, query, hive, yongqiang, file, data, col, key, prasad, create, code, patch, indexes, queries, select, base, rewrite, john, chakka, based, columns, indexing, sort, sichi, offsets, support, jun, order, time, block, sorted, case, column, add, rows, design, dont, set, timestamp, files, comments, tables, offset, sparse, part, jira

Topics {✒️}

mentioned optimization ql/src/java/org/apache/hadoop/hive/ql/rewrite/rules/hiverwrule java ql/src/java/org/apache/hadoop/hive/ql/rewrite/rules/hiverwrulecontext java ql/src/java/org/apache/hadoop/hive/ql/parse/qbparseinfo java ql/src/java/org/apache/hadoop/hive/ql/parse/semanticanalyzer java ql/src/java/org/apache/hadoop/hive/ql/rewrite/hiverewriteengine ql/src/java/org/apache/hadoop/hive/ql/index/compact/compactindexhandler ql/src/java/org/apache/hadoop/hive/ql/metadata/table java ql/src/test/queries/clientpositive/ql_rewrite_gbtoidx ql/src/test/results/clientpositive/ql_rewrite_gbtoidx org/hadoop/hive/indexdev yongqiang run-time map-red operator /user/hive/warehouse/src_rc_index inputformat /compsci/project_spotlight/datamgmt/sigmod2003 sort-based multi-column index db2 multi-dimensional clustering light-weight rewrite support rewrites toplevel qb /user/ecapriolo/hivetest4 insert overwrite directory make hadoop-hive suitable public getter/setter methods rewrite engine full-fledged index support relationships connecting tbls/sds hadoop-related projects apache-git repository map-side join based hive ql part time-consuming mapreduce phase brute force table-scans java – hasindex intial rewrite rule single-column index based src combine kv-pairs rewrite boolean flag jeff hammerbacher added semantic analysis phase hive ql mysql metastore upgrade gbtoidx rewrite rule blocks/map-tasks things schubert zhang added namit jain added hiveql related stuff min-max range report back errors metastore object model john sichi added avoiding outofmemory exceptions

Questions {❓}

Are you worried about the sort phase of the reducer or the IndexBuilder's reducer code?
What are the other options for the index output format?
Also, how is the file name populated?
Also, how would this be used to query the table?
Any chance you guys could post a more detailed design document for "full-fledged index support"?
Can you think of any other ways?
Did my comment make it seem like I thought otherwise?
For this, we'll need to add property key/values to the grammar (IDXPROPERTIES like TBLPROPERTIES and SERDEPROPERTIES)?
Have you started working on this one?
How should we model this in such a way that it takes per-partition indexing into account?
If the index is too large to fit in memory, how can the whole index be read into memory?
Is that correct?
Is the idea here to select from the index and then pass the offsets to another query to look up the table?
Is the index based on sort?
Is the partitioning for the index independent of the partitioning for the table?
OR predicates to union (for efficiency?
One initial question I had - why is virtualcolumn class in the serde2 package?
One of my suggestions would be that, since we've done indexing with Mapreduce, and for some queries based on the generated indexes, can we just omit the time-consuming Mapreduce phase during the querying period, as we've already got all of the files/offsets and we can go to these specific file offsets directly to get relevant rows of the table?
Or can we piggyback this sorting on top of the Hadoop reduce sort phase somehow?
Prasad, you said you already wrote some code, would you please attach it?
Prasad, is the index you designed a kind of hash index?
Should we also consider them?
Since the code is in a prototype stage, can we move the index code to contrib?
Thoughts?
Are there any references on this technique?
Are we going to have one index file per hdfs file?
Creation of SUMMARY index table?
The patch is not working; can you share the latest patch here?
Patch, but I don't see the virtual columns in there?
Related question is how this is going to interact with sampling?
Can these separate index files, one for each HDFS file, be expressed as a single table in Hive?
What I am trying to say is that for such frequent keys indexing may not be of much help, so maybe we can relax the 'sort' property?

External Links {🔗}(13)

Libraries {📚}

Fancybox
Foundation
jQuery
Moment.js
Raphael

CDN Services {📦}

Static
