Subset source coding

Ebrahim Molavianjazi, Aylin Yener

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    1 Citation (Scopus)

    Abstract

    Emerging applications, including semantic information processing, impose priorities on the possible realizations of information sources, so that not all source sequences are important. This paper proposes an initial framework for optimal lossless compression of subsets of the output of a discrete memoryless source (DMS). It turns out that the optimal source code may not index the conventional source-typical sequences, but may instead index certain subset-typical sequences determined by the source statistics as well as the subset structure. Building upon an achievability result and a strong converse, an analytic expression is given, based on the Shannon entropy, relative entropy, and subset entropy, which identifies such subset-typical sequences for a broad class of subsets of a DMS. Interestingly, one often achieves a gain in the fundamental limit, in that the optimal compression rate for the subset can be strictly smaller than the source entropy, although this is not always the case.

    Original language: English (US)
    Title of host publication: 2015 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2015
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 857-864
    Number of pages: 8
    ISBN (Electronic): 9781509018239
    DOI: https://doi.org/10.1109/ALLERTON.2015.7447096
    State: Published - Apr 4 2016
    Event: 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2015 - Monticello, United States
    Duration: Sep 29 2015 → Oct 2 2015

    Other

    Other: 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2015
    Country: United States
    City: Monticello
    Period: 9/29/15 → 10/2/15

    Fingerprint

    Entropy
    Set theory
    Semantics
    Statistics

    All Science Journal Classification (ASJC) codes

    • Computer Networks and Communications
    • Computer Science Applications
    • Control and Systems Engineering

    Cite this

    Molavianjazi, E., & Yener, A. (2016). Subset source coding. In 2015 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2015 (pp. 857-864). [7447096] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ALLERTON.2015.7447096
    @inproceedings{1bd4b02ad9e6423b8603960ef5a488a7,
    title = "Subset source coding",
    author = "Ebrahim Molavianjazi and Aylin Yener",
    year = "2016",
    month = "4",
    day = "4",
    doi = "10.1109/ALLERTON.2015.7447096",
    language = "English (US)",
    pages = "857--864",
    booktitle = "2015 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2015",
    publisher = "Institute of Electrical and Electronics Engineers Inc.",
    address = "United States",

    }

    TY - GEN

    T1 - Subset source coding

    AU - Molavianjazi, Ebrahim

    AU - Yener, Aylin

    PY - 2016/4/4

    Y1 - 2016/4/4

    UR - http://www.scopus.com/inward/record.url?scp=84969754425&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84969754425&partnerID=8YFLogxK

    U2 - 10.1109/ALLERTON.2015.7447096

    DO - 10.1109/ALLERTON.2015.7447096

    M3 - Conference contribution

    SP - 857

    EP - 864

    BT - 2015 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2015

    PB - Institute of Electrical and Electronics Engineers Inc.

    ER -